requesting design feedback
sydneypemberton1986 opened this issue · 2 comments
Hey folks!
First off, thank you for writing such a wonderful tool!
I work for findhelp.com and we just had our 2023 hackathon, in which we worked on leveraging import-linter to create a Github Action to ensure that the modular structure of our architecture is not degraded by pull requests.
We ended up writing our own custom Contract, which was good enough for the hackathon to get a working proof of concept.
But now we want to make a more complete offering and we wanted to get your feedback on the direction we are going, and if anything we are working on would be useful pull back into the import-linter itself.
Our codebase was designed to be a layered architecture but that was not enforced so it would not make sense to just hook the return code of pass or fail into our CI/CD pipeline. Instead we want to fail a build if the number of disallowed imports increases. So the first thing we did was go in and change the output to machine readable JSON so we could parse out the number of failures.
Furthermore, we are slowly working on splitting our architecture into "macro services". This means that a simple layered contract was insufficient for us, but instead we needed to implement a graph of allowed imports.
We also had a need to reduce various disjoint modules into conceptually unified modules. For example:
- Flask, httplib, etc and label it as "front-end libraries". (we only want our controller code working with the HTTP abstraction)
- datastore and data_access packages and label it as "persistence layer" (it would be a big lift to reorganize this structure at this point)
- sqlalchemy and ndb as "database libraries" (we only want our persistence code working with a database abstraction)
Finally, we wanted to be able to see the current state of the codebase, or possibly even produce an animation of our modular structure improving over the history of the project.
To solve these issues we are working on writing a Contract that looks a bit like this
import json
from collections import defaultdict
from pathlib import Path
from typing import Any, Dict, Iterable, List, Set
from grimp.adaptors.graph import ImportGraph
from importlinter import Contract, ContractCheck, fields, output
class ReducingGraphContract(Contract):
"""Checks the overall modular structure of the imports of a codebase.
A Contract that reduces the import graph into a simplified graph reflecting
the overall modular structure of a project and then presents the degree to which
imports do not honor the modular structure.
I would like to get this to a state where it takes as configuration input the following:
- A [graph homomorphism][1] from the actual modules to the desired groupings
in the form of a dictionary from the resulting simplified modules to
the individual modules they combine.
Example:
{
"front_end_libraries": [
"httplib",
"webapp2",
...
],
"referrals_persistence": [
"core.referrals.datastore",
"core.referrals.data_access",
],
"referrals_business": [
"core.referrals.services",
],
"persistence_libraries" : [
"sqlalchemy",
"google.appengine.ext.ndb",
...
],
...
}
- A graph of allowed relations between the desired groupings.
This could be represented as either an [adjacency matrix][2] or an [adjacency list][3].
An adjacency list would be easier to reason about for most folks,
and it would be more compact because this graph is bound to be sparse.
You could represent this as a list of tuples, for example.
[
("controllers", "business"),
("business", "persistence"),
("team_a.services", "team_b.services"),
...
]
Alternatively, an adjacency matrix would make it easier to later add weights to edges.
This would en able adding more sophisticated metrics.
Imports are located in a layer and import from a layer
You can set up an adjacency matrix A weighted with costs
between each of controllers (c), services (s), and persistence(p).
A cost of 0 indicates an allowed transition.
Acc Acs Acp
Asc Ass Asp
Apc Aps App
import numpy as np
costs = np.array([
[0, 0, 1],
[2, 0, 0],
[3, 2, 0]
])
Then the output of the import linter could also be formatted as a matrix,
making the calculation of total cost trivial
counts = np.array([
[1, 25, 1],
[1, 20, 9],
[0, 0, 12],
])
total_cost = np.sum(np.multiply(costs, counts))
Both of these graph representations shouldn't be too hard to convert to a d3.js representation,
which can produce some [pretty awesome graph displays][4].
1: https://en.wikipedia.org/wiki/Graph_homomorphism
2: https://mathworld.wolfram.com/AdjacencyMatrix.html
3: https://en.wikipedia.org/wiki/Adjacency_list
4: https://observablehq.com/@d3/force-directed-graph
"""
...
Furthermore, since the data we are going to be using to specify these inputs is complex, we are also playing with a JSON configuration file format, something like this:
commit f44d3e850dc4c0e2d0f0aec52b0625a614ed0221
Author: Sydney Pemberton <spemberton@findhelp.com>
Date: Wed Mar 29 10:26:56 2023 -0500
add a json file
diff --git src/importlinter/adapters/user_options.py src/importlinter/adapters/user_options.py
index 76fefa3..6fc9140 100644
--- src/importlinter/adapters/user_options.py
+++ src/importlinter/adapters/user_options.py
@@ -1,4 +1,5 @@
import configparser
+import json
from typing import Any, Dict, Optional, List
import abc
import sys
@@ -97,13 +98,29 @@ class TomlFileUserOptionReader(AbstractUserOptionReader):
contracts = session_options.pop("contracts", [])
- self._normalize_booleans(session_options)
+ _normalize_booleans(session_options)
for contract in contracts:
- self._normalize_booleans(contract)
+ _normalize_booleans(contract)
return UserOptions(session_options=session_options, contracts_options=contracts)
- def _normalize_booleans(self, data: dict) -> None:
- for key, value in data.items():
- if isinstance(value, bool):
- data[key] = str(value)
+
+class JsonFileUserOptionReader(AbstractUserOptionReader):
+ potential_config_filenames = ["importlinter.json", ".importlinter.json"]
+
+ def _read_config_filename(self, config_filename: str) -> Optional[UserOptions]:
+ file_contents = settings.FILE_SYSTEM.read(config_filename)
+ data = json.loads(file_contents)
+ contracts = data.pop("contracts", [])
+
+ _normalize_booleans(data)
+ for contract in contracts:
+ _normalize_booleans(contract)
+
+ return UserOptions(session_options=data, contracts_options=contracts)
+
+
+def _normalize_booleans(data: dict) -> None:
+ for key, value in data.items():
+ if isinstance(value, bool):
+ data[key] = str(value)
diff --git src/importlinter/configuration.py src/importlinter/configuration.py
index b56198f..32e8cac 100644
--- src/importlinter/configuration.py
+++ src/importlinter/configuration.py
@@ -2,7 +2,7 @@ from .adapters.building import GraphBuilder
from .adapters.filesystem import FileSystem
from .adapters.printing import ClickPrinter
from .adapters.timing import SystemClockTimer
-from .adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader
+from .adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader, JsonFileUserOptionReader
from .application.app_config import settings
@@ -11,6 +11,7 @@ def configure():
USER_OPTION_READERS={
"ini": IniFileUserOptionReader(),
"toml": TomlFileUserOptionReader(),
+ "json": JsonFileUserOptionReader(),
},
GRAPH_BUILDER=GraphBuilder(),
PRINTER=ClickPrinter(),
diff --git tests/unit/adapters/test_user_options.py tests/unit/adapters/test_user_options.py
index a7f4ba9..064bdaf 100644
--- tests/unit/adapters/test_user_options.py
+++ tests/unit/adapters/test_user_options.py
@@ -1,6 +1,6 @@
import pytest
-from importlinter.adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader
+from importlinter.adapters.user_options import IniFileUserOptionReader, TomlFileUserOptionReader, JsonFileUserOptionReader
from importlinter.application.app_config import settings
from importlinter.application.user_options import UserOptions
from tests.adapters.filesystem import FakeFileSystem
@@ -192,3 +192,69 @@ def test_toml_file_reader(contents, expected_options):
options = TomlFileUserOptionReader().read_options()
assert expected_options == options
+
+
+@pytest.mark.parametrize(
+ "contents, expected_options",
+ (
+ (
+ "{}",
+ UserOptions(session_options={}, contracts_options=[]),
+ ),
+ (
+ """
+ {
+ "foo": "hello",
+ "bar": 999
+ }
+ """,
+ UserOptions(session_options={"foo": "hello", "bar": 999}, contracts_options=[]),
+ ),
+ (
+ """
+ {
+ "foo": "hello",
+ "include_external_packages": true,
+ "contracts": [
+ {
+ "id": "contract-one",
+ "name": "Contract One",
+ "key": "value",
+ "multiple_values": [
+ "one",
+ "two",
+ "three",
+ "foo.one -> foo.two"
+ ]
+ },
+ {
+ "name": "Contract Two",
+ "baz": 3
+ }
+ ]
+ }
+ """,
+ UserOptions(
+ session_options={"foo": "hello", "include_external_packages": "True"},
+ contracts_options=[
+ {
+ "name": "Contract One",
+ "id": "contract-one",
+ "key": "value",
+ "multiple_values": ["one", "two", "three", "foo.one -> foo.two"]
+ },
+ {"name": "Contract Two", "baz": 3}
+ ],
+ ),
+ ),
+ ),
+)
+def test_json_file_reader(contents, expected_options):
+ settings.configure(
+ FILE_SYSTEM=FakeFileSystem(
+ content_map={"/path/to/folder/.importlinter.json": contents},
+ working_directory="/path/to/folder",
+ )
+ )
+ options = JsonFileUserOptionReader().read_options()
+ assert expected_options == options
What do y'all think?
Are we on the right track?
Is our approach sound?
Is anything here useful to port to your project?
Hi Sydney, thanks for your message! Sounds really interesting what you're doing.
My immediate reaction is that it's possible you could achieve your aims in a simpler way, but I might be missing something.
The fundamental requirement, which is to fail if the number of disallowed imports increases, is probably better done by having a passing contract that contains ignored imports. If you commit your contract to version control, then any PR that introduces a new layering violation will start failing (unless the commiter also adds an ignored import). This makes it easy to see whether or not things are getting worse.
If you want to make further tooling around this, there is actually a Python API for reading the contracts. You can use this for, say, recording a metric of the number of ignored imports in your contract - or even a custom Github action that prevents merge if the number increases.
Does that help?
Thanks. I'll close this since it's not really actionable and I responded in my PR..