OWASP/pytm

Expanding seq diagrams

mikejarrett opened this issue · 4 comments

First apologies for the long winded issue...

tl;dr - Are there any plans in extending out some plantuml sequence diagram functionality?

I've been using pytm for about a week now and have found it to be a pretty good tool. Coming more from a security side I am more focused on the dataflow diagram and threat modelling report however I had a lot of requests to add bits and pieces to the the seq diagram as well (e.g. lifelines, queue participants, dividers, arrow style).

I'm happy to make a pull request with a few suggested changes (still wrapping my head around how it all ties together)

Queues

The easiest being:

class Queue(Datastore):
    pass


class TM:
    ...
    def seq(self):
        for e in TM._elements:
            ...
            elif isinstance(e, Queue):  # this would need to go before the `Datastore` check
                value = 'queue {0} as "{1}"'.format(e._uniq_name(), e.display_name())

I think the more robust, and probably more extensible way would be another attribute and, ideally, a new method to handle the line that will be formatted / printed in seq()

class Element:
    def seq_line(self):  # naming things is difficult
        return 'entity {} as "{}"'.format(self._uniq_name(), self.display_name()'

class Actor(Element):
    def seq_line(self): 
        return 'actor {} as "{}"'.format(self._uniq_name(), self.display_name()'

class Datastore(Element):
   def seq_line(self):
        if self.isQueue:
            puml_participant = "queue"
        else:
            puml_participant = "database"
        return '{} {} as "{}"'.format(puml_participant, self._uniq_name(), self.display_name()'

class TM:
    def seq(self):
        participants = [e.seq_line() for e in TM._elements if not isinstance(e, (Dataflow, Boundary)]
        ...
...

 my_queue = Datastore("my_queue", isQueue=True)

Dataflow arrows

I think the simplest would be to add new attribute arrowStyle, implement the suggested seq_line from above and instantiate a dataflow the following way to get a dotted blue, open arrow line for the lines in the sequence digramt:

class Dataflow(Element):
    ...
    arrowStyle = varString("->")
    ...

    def seq_line(self):
        note = "\nnote left\n{}\nend note".format(self.note) if not self.note else ""
        line =  "{source} {arrow} {sink}: {display_name}{note}".format(
            source=self.source._uniq_name(),
            arrow=self.arrowStyle,
            sink=self.sink._uniq_name(),
            note=note,
        )

...
df = Dateflow(source, sink, "My dataflow", arrowStyle="-[#blue]->>")

Lifelines

This one I am still exploring and not confident in any implementation yet but maybe something like

class TM:
    ...
    includeSeqLifelines = varBool(False)
    ...

    def seq(self):
        ...
        messages = []
        for e in TM._flows:
            if e.response and self.includeSeqLifelines:  # at the start of the loop
                messages.append('activate {}\n'.format(e.sink._uniq_name()))
            ... all the other flow stuff here ...
            if e.responseTo and self.includeSeqLifelines:  # at the end of the loop
                messages.append('deactivate {}\n'.format(e.responseTo.sink._uniq_name()))

Or introduce a concept of SeqLifelines, again not happy with the exploration I've done so far but quick back of a napkins code:

class _Dummy:
    """A temporary dummy class that allows me to insert SeqLifelines in the flows portion of TM.

    This is where a lot of my uncertaintity comes in. Obviously, if I implement I would properly fix up the 
    SeqLifeline to work properly
    """

    data = []
    levels = {0}
    overrides = []
    protocol = "HTTPS"
    port = 443
    authenticatesDestination = True
    checksDestinationRevocation = True
    name = "Dummy"

class Lifeline(Enum):
    ACTIVATE = "activate"
    DEACTIVATE = "deactivate"
    DESTROY = "destroy"

class varLifeline(var):
    def __set__(self, instance, value):
        if not isinstance(value, Lifeline):
            raise ValueError("expecting a Lifetime, got a {}".format(type(value)))
        super().__set__(instance, value)

class SeqLifeline(Element):
    name = varString("", required=True)
    action = varLifeline(None, required=True)
    participant = varElement(None, required=True)
    color = varString("", required=False, doc="Color variable for the lifeline")
    source = _Dummy
    sink = _Dummy
    data = []
    responseTo = None
    isResponse = False
    response = None
    order = -1

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        TM._flows.append(self)

    def seq_line(self):
        return '{} {}\n'.format(self.action.value, self.participant._uniq_name())

    def dfd(self, **kwargs) -> str:
        """SeqLifeline shouldn't show on dfd's but we do want them to render on seq diagrams."""
        return ""

Other ideas

  • Add a varNote class that allows defining shape, color, location see Notes on messages for attributes we could declare
  • Introduce SeqNewPage and the ability write seq() out to separate files
    • Ability to add hide unlinked (would only work with new page)
  • Introduce SeqDivider see Divider or sperator
  • Add ability group participants on boundaries Participants encompass

Go with seq_line and arrow_style. Lifelines should be a separate issue. I haven't used seq diagrams too much but I'm excited to see new contributions 😊

izar commented

First of all thanks for coming up with the suggestions!

My use of sequence diagrams has been limited, mostly to understand, well, the sequence of events. I wonder if with this proposal we are not breaking the philosophy of "one small tool doing well what it is supposed to do" and overloading pytm with plantuml functionality. In other words, what do these additions represent in terms of a threat model, and how would they lead to the elicitation of new threats? In other words, are there threat rules that map to the usage of these symbols?

Hello! Yeah it has been fun wrapping my head around pytm.

To be honest, I was hesitant to suggest adding sequence functionality because it doesn't add directly to threat modelling and was having the same internal debate with myself as well about this very thing (does one thing well).

For me, the use case is we have some asynchronous calls and some producer-consumer patterns that we wanted to highlight differently in the sequence diagrams then the standard calls that were being made.

Visually, at least to me, it calls attention and maybe is a signal to add more controls around these calls or scrutinize implementation but I'll admit, I am struggling to find strong argument for any producer-consumer or asynchronous threats (e.g. DoS or privileged execution) which wouldn't already be covered by hasAccessControl, validatesInput, sanitizesInput or handlesResourceConsumption.

I think sequence diagrams (And the more detailed, expressive the better) are an excellent threat modelling resource. While I like DFDs for understanding transitions between trust domains, it's the sequence diagrams that really help me understand exactly what is going on in a given subsytem or in response to a particular command.

+1 to improvements here.