/producer

Smalltalk to Objective-C source-to-source compiler by Brad Cox and Kurt Schmucker (from comp.sources.misc 08/20/87)

Primary language: Objective-C. License: The Unlicense.

This is a copy of the "Producer" Smalltalk-80 to Objective-C Translator source code by Brad J. Cox and Kurt J. Schmucker

Related Paper:

Brad J. Cox and Kurt J. Schmucker. "Producer: A Tool for Translating Smalltalk-80 to Objective-C." ACM SIGPLAN Notices, Volume 22, Issue 12, pp. 423–429, 1987. https://doi.org/10.1145/38807.38846 https://dl.acm.org/doi/pdf/10.1145/38807.38846

The sources were posted to the USENET newsgroup comp.sources.misc on August 20th, 1987, by Dieter H. Zebbedies (dietz@zhmti.UUCP).

https://usenet.trashworldnews.com/?thread=639838

The original README follows...


            Producer: Smalltalk-80 to Objective-C Translator


                              Brad J. Cox
                  Productivity Products International
                              75 Glen Road
                          Sandy Hook, CT 06482
                            (203) 426 1875.


      Smalltalk-80 is a tool for  turning  raw  concepts  into  working
 software prototypes. Objective-C is a tool for turning proven concepts
 into fast, commercial-quality, production systems. Producer is a  tool
 for  bridging  the gap between prototyping and production by automati-
 cally translating Smalltalk-80 sources into Objective-C  sources.  The
 translation is guided by a rule base in which the programmer describes
 how differences between the Smalltalk-80 prototyping  environment  and
 the   Objective-C  production  environment  should  be  resolved  when
 translating the code.

      At SIGGRAPH-87, PPI will announce a  library  of  user  interface
 components  from which programmers build applications with iconic user
 interfaces.  The library and applications built using it are  portable
 across  diverse  window  systems,  initially X-Windows, SunWindows and
 Hewlett Packard's window system. While the Objective-C user  interface
 classes  are  different from Smalltalk's, they are similar enough that
 Producer can usually bridge the differences with some  hand-tuning  of
 the  translated  output.   We  confidently hope that Objective-C, this
 library and Producer will make automatic translation  of  Smalltalk-80
 prototypes  a  routine  part  of  many companies' software development
 lifecycle.

      I'm distributing Producer to enlist  your  help  in  testing  the
 practicality of this notion.



                               Disclaimer


      Producer is not a mature software  product  but  an  embryo  that
 could  grow  to maturity someday.  Specifically it is not supported or
 warranted  in any  way.  It  was  written  by  myself,  an  individual
 employed  by PPI, and has been released prior to maturity by myself as
 an individual with the consent of the  company.   This  document  will
 make its strengths and some of its present shortcomings clear.

      However, even in its present state,  Producer  demonstrates  that
 automatic  translation  is technically feasible and its present imple-
 mentation provides a capable foundation on which to build.  Since  the
 market  for Smalltalk-80 translators is insufficient for PPI to pursue
 presently, we've released Producer for you to make what use of it  you
 can.

      I do ask that you keep me informed of your experiences in using
 it in its current state, and PPI requests that you feed back any
 improvements so that we can offer a fully supported translation
 product in the future. PPI retains the copyright and all other
 applicable rights. For example, you may not sell products that contain
 any part of the Producer distribution without PPI's permission.


 Brad Cox                                                 June 22, 1987



                              How it works


      The following is a brief description of how Producer works inter-
 nally.   This  was written from my recollection of how I left the code
 over a year ago. It may be inaccurate in places.

      Producer is basically a compiler. Its lexical analyzer (written
 in lex) divides Smalltalk-80 text into lexemes, and its parser
 (written in yacc) recognizes valid lexeme sequences and constructs an
 abstract representation of the program as an expression tree. The
 expression tree consists of instances of Objective-C classes; e.g.
 Method, Statement, Expression, Message, and Variable. The grammar was
 derived from the syntax diagrams in Goldberg and Robson; Smalltalk-80:
 The Language and its Implementation; Addison Wesley; 1986.

      The grammar was extended to recognize rules, which may also
 appear in the lexeme stream. Rules are enclosed in { braces } to help
 fend off shift-reduce conflicts from yacc. The parser stores the rules
 in separate data structures for use during code generation.

      At certain points, the parser sends the top of the expression
 tree a gen message to trigger code generation[1]. Recall that
 Smalltalk-80 is an extremely simple language with basically two
 components; data references (variables, literals, etc) and messages.
 Rules may influence how each case is treated during code generation.
 ____________________

 [1] I now regard this as a major architectural flaw whenever I see it
 in any application. It represents a key departure from an important
 but often ignored rule of object-oriented design. The expression tree
 classes should be abstract so that they could be reused in other
 tools. But their code generation methods pollute the abstraction with
 knowledge about a particular concrete interface; Objective-C. The code
 generation methods should have been provided in a separate hierarchy
 of classes that know how to connect the abstract classes to one of
 many potential concrete interfaces. This rule is simply a
 generalization of the model/view/controller paradigm to apply to
 interfaces of any kind, not just user interfaces.

      Code generation proceeds in two passes. The first pass collects
 typing information for each symbol and message by examining the
 expression tree from the bottom up. The bottom-most nodes are either
 literals whose type is immediately obvious (e.g. 1, 2.3, or 'string'),
 or they are symbols whose type can be known or unknown. Symbol types
 become known either as the result of a previous type inferencing
 operation or because their type was specified in a rule. Unknown
 symbols default to id when first referenced.
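
      The bottom-up typing pass described above can be sketched in plain
 C. This is only an illustration, not Producer's code: Producer is
 written in Objective-C, and the names Type, Kind, Node, SymbolRule and
 infer_types are hypothetical. Message nodes receive the placeholder
 type id here, because choosing a message translation is a separate
 table lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags and node kinds, invented for this sketch. */
typedef enum { T_UNKNOWN, T_INT, T_DOUBLE, T_STRING, T_ID } Type;
typedef enum { N_INT_LIT, N_FLOAT_LIT, N_STRING_LIT, N_SYMBOL, N_MESSAGE } Kind;

typedef struct Node {
    Kind kind;
    const char *name;      /* symbol name or selector; NULL for literals */
    struct Node *kids[3];  /* receiver and arguments of a message node */
    int nkids;
    Type type;             /* filled in by infer_types */
} Node;

/* A variable rule: "the Smalltalk variable called name has this type". */
typedef struct { const char *name; Type type; } SymbolRule;

static Type lookup_symbol(const SymbolRule *rules, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(rules[i].name, name) == 0)
            return rules[i].type;
    return T_UNKNOWN;
}

/* Post-order walk: children are typed before the node that uses them. */
Type infer_types(Node *node, const SymbolRule *rules, size_t nrules)
{
    assert(node != NULL);
    for (int i = 0; i < node->nkids; i++)
        infer_types(node->kids[i], rules, nrules);

    switch (node->kind) {
    case N_INT_LIT:    node->type = T_INT;    break;
    case N_FLOAT_LIT:  node->type = T_DOUBLE; break;
    case N_STRING_LIT: node->type = T_STRING; break;
    case N_SYMBOL:
        node->type = lookup_symbol(rules, nrules, node->name);
        if (node->type == T_UNKNOWN)
            node->type = T_ID;   /* unknown symbols default to id */
        break;
    case N_MESSAGE:
        node->type = T_ID;       /* placeholder; see the lead-in */
        break;
    }
    return node->type;
}
```

 With a single rule declaring foo as int, a symbol foo is typed int while
 an undeclared symbol bar falls back to id.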

      Most of the internal nodes are messages. Message typing is
 slightly more complicated because any message can have multiple
 translations, depending on how it is used: different rules may specify
 different translations for different receiver and argument types, and
 each translation may compute a different type. Since we assign types
 bottom up, types have already been assigned for the arguments and the
 receiver, so a translation for that selector is chosen by searching a
 table of possible translations for one matching the receiver and
 argument types.
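
      The table search described above might look like the following C
 sketch. It is an assumption-laden illustration, not the data structure
 Producer uses: the Translation struct, the printf-style output
 templates and the table contents are all hypothetical. Only the idea of
 matching on selector, receiver type and argument type comes from the
 text, along with the names ptPlus and plus: which appear elsewhere in
 this README.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { T_INT, T_DOUBLE, T_STRING, T_POINT, T_ID } Type;

/* One candidate translation for a one-argument message (hypothetical). */
typedef struct {
    const char *selector;  /* Smalltalk selector, e.g. "+" */
    Type receiver;         /* required receiver type */
    Type argument;         /* required argument type */
    const char *output;    /* illustrative output template */
    Type result;           /* type computed by this translation */
} Translation;

static const Translation table[] = {
    { "+", T_INT,    T_INT,    "%s + %s",        T_INT    },
    { "+", T_DOUBLE, T_DOUBLE, "%s + %s",        T_DOUBLE },
    { "+", T_POINT,  T_POINT,  "ptPlus(%s, %s)", T_POINT  },
    { "+", T_ID,     T_ID,     "[%s plus:%s]",   T_ID     },
};

/* Return the first entry matching the selector and both operand types,
   or NULL so that the caller can fall back to a default translation. */
const Translation *find_translation(const char *selector,
                                    Type receiver, Type argument)
{
    assert(selector != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        const Translation *t = &table[i];
        if (strcmp(t->selector, selector) == 0 &&
            t->receiver == receiver && t->argument == argument)
            return t;
    }
    return NULL;
}
```

 Because operand types are already known when a message node is reached,
 the same selector can yield a C expression for ints, a macro call for
 points, and a message send for objects.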

      In all cases, unless overridden by a specific rule, default
 translations are used. These amount to a fairly literal translation
 from Smalltalk-80 syntax to Objective-C syntax. However, exceptions
 are made for Smalltalk literal constants, which translate to C literal
 constants. For example, 2+2 translates to [2 plus:2], which is
 guaranteed to fail catastrophically in Objective-C. The integer 2 is
 an object only in Smalltalk!

      The moral: Never believe the translator. Always monitor it
 closely. Remember the 90-10 rule. The automatic translation concept is
 capable, with suitable rules, of automatically translating only 90% of
 an application correctly; the other 10% (where the bugs will have
 congregated) is still up to you.



                         Implementation Status


      Producer currently represents about three  man-weeks  of  effort,
 spent  in  two  intensive  bursts  separated by about a year. The most
 recent burst was nearly a year and a half ago.  The first burst was to
 demonstrate  the  feasibility and practicality of the translation con-
 cept. The second burst was in the course of preparing  a  paper  that,
 coauthored  with Kurt Schmucker, will appear in the OOPSLA-87 proceed-
 ings. A (very) early draft is provided with this distribution.

      For being developed so quickly, the translator does an effective
 job of translation. I refer you to the paper for discussions of the
 strengths and limitations of the translation concept. This section
 discusses the current implementation of this concept and lists the
 items on my own must-do list for the planned, but not yet completed,
 third stage of Producer's evolution.

 (1)  Smalltalk-80 fileout format uses '!' delimiters in a fashion that
      I  was never able to formalize correctly in Producer's yacc gram-
      mar. The symptom is that  the  translator  will  generate  syntax
      errors  in nearly every translated file for certain of these del-
      imiters. I'm told that fileout format has been  documented  in  a
      paper  somewhere, but I've never worked the repairs back into the
      code. The fix should be local to gram.y.

 (2)  The translator loads its rule base by reading files of  rules  as
      if  they were concatenated with the sources to be translated. The
      rule-specification syntax is abysmal, primarily  because  it  was
      chosen  to  minimize  the  amount of time I spent struggling with
      shift-reduce conflicts from yacc, rather than  making  the  rules
      intelligible  to  users. Smalltalk's formal grammar seemed unrea-
      sonably difficult for yacc to swallow, and I suspect the  problem
      may  lie  in  some  mistake I've made in translating Smalltalk-80
      syntax diagrams into yacc specifications.

 (3)  The program contains extensive provisions for reporting its cogi-
      tations in type inferencing. The various error, warning, logging,
      and debugging messages need to be tuned for greater utility.

 (4)  The code was based on an as yet unreleased library (phylum) called
      "Substrate",  which  supports  features  that  are not yet in our
      standard product set, like  Blocks,  Coroutining,  and  exception
      handling.   I made a fast editing pass to remove any dependencies
      on these nonstandard library  features.  I  also  added  a  file,
      Substrate.h,  that defines stylistic conventions that I adhere to
      in all my work. See USE, IMPORT, EXPORT, etc in the sources.

      The preceding problems are superficial and easily repaired.  The
 following  ones  are  somewhat  more  substantial in that they involve
 design work in addition to coding work.

 (1)  The type inferencing machinery infers types of newly-encountered
      (unknown) messages and variables by seeing how they are combined
      with variables and messages whose types are known a priori or
      else determined earlier through inferencing. The only types that
      are known a priori are those of literals like 1, 2.3, or
      'string'. This generally provides insufficient typing information
      from which to infer anything useful, so you should generally
      provide variable rules to pin down types for key instance
      variables and method arguments. You do this with rules that
      state, in effect, that `the type of the Smalltalk variable named
      foo is int, and the variable is called foobar in Objective-C'.
      Presently rules have global scope. If different Smalltalk classes
      use the name foo in ways that should be translated differently,
      different rule sets must be provided manually to the translator.
      Creating and managing these application-specific rule sets adds
      to the translation effort and tends to make rules non-reusable
      across translations. The rules should be organized with a scoping
      mechanism, ideally one based on inheritance.

 (2)  The inferencing logic is ad-hoc and quite possibly slow.  However
      the  main  bottleneck seems to be loading the rule-base; transla-
      tion  speed  has  never  been  a  real  problem.  Inferencing  is
      presently  deductive,  and  a more inductive scheme based on both
      forwards and backwards reasoning  might  produce  higher  quality
      translations.  In other words, the translation of a given message
      expression is determined exclusively by whatever information  can
      be inferred about the types of the receiver and arguments to that
      message (forward reasoning). Backward reasoning would  also  con-
      sider how the results of the expression are used in other expres-
      sions.

 (3)  Producer does not presently handle non-trivial uses of Blocks
      correctly; i.e. Block expressions that cannot be translated
      directly into C control statements like if, while, or for, which
      Producer handles just fine already. Nearly all occurrences of
      Smalltalk-80 Blocks could be handled without changing the
      Objective-C language by adding a trivially simple Block class to
      the library. A named instance variable holds a pointer to a
      static function and indexed instance variables hold copies of any
      variables that the block accesses in the instantiation site[2].
      This copy could be taken entirely automatically by copying the
      instantiation site's stack frame. However, I prefer to have more
      control over space than that. So I've been using a scheme that
      requires the programmer (and someday the compiler) to specify
      which variables are really accessed by the block as arguments to
      the message that instantiates the block, like this:

           ... {
               IMPORT void aStaticFunction();
               id var1 = something, var2 = something;
               /* the block stores copies of var1 and var2 */
               aBlock = [Block function:aStaticFunction args:2, var1, var2];
               [anyObject do:aBlock];
               ...
           }
           /* called when the block receives value:value: */
           LOCAL void aStaticFunction(instantiationSiteVariables, value1, value2)
               struct { id var1, var2; } *instantiationSiteVariables;
               id value1, value2;
           {
               if ([instantiationSiteVariables->var1 someMessage])
                   ...
           }


      The block will call the function when anyObject sends the block
      one of several evaluation messages (value:arg1 or value:arg1
      value:arg2 or ...). The first argument is a pointer to the
      block's copy of the instantiation site's variables. The trailing
      arguments contain the arguments that the invocation site passed
      in the value: message. I've used this approach extensively by
      writing the static functions by hand, and am trying to get our
      staff to extend the language to provide some kind of
      language-level support to make the syntax simpler. This approach
      could be, but has not yet been, taken by Producer.
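
      The Block scheme in item (3) can be approximated in plain C as a
 function pointer stored next to a private copy of the captured
 variables. The sketch below is an assumption, not the Block class from
 Substrate: the names Block, block_new, block_value, Captured and
 add_to_total are hypothetical, and a single int argument stands in for
 the id arguments of value:.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The function receives a pointer to the block's private copy of the
   instantiation site's variables, then the value passed at the call. */
typedef void (*BlockFunction)(void *captured, int value);

typedef struct {
    BlockFunction function;
    size_t size;               /* number of captured bytes */
    unsigned char captured[];  /* copy taken when the block is created */
} Block;

/* Create a block holding a copy of size bytes starting at vars. */
Block *block_new(BlockFunction function, const void *vars, size_t size)
{
    assert(function != NULL);
    Block *b = malloc(sizeof *b + size);
    if (b == NULL)
        return NULL;
    b->function = function;
    b->size = size;
    if (size > 0)
        memcpy(b->captured, vars, size);
    return b;
}

/* Counterpart of sending value: to the block. */
void block_value(Block *b, int value)
{
    assert(b != NULL);
    b->function(b->captured, value);
}

/* Example block body: accumulates into the captured copy only. */
struct Captured { int total; };

void add_to_total(void *captured, int value)
{
    struct Captured *vars = captured;
    vars->total += value;
}
```

 Evaluating such a block changes only its own copy of total; the variable
 at the instantiation site keeps its original value, which is the
 behavior footnote [2] describes for the Objective-C scheme.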
 ____________________

 [2] In Smalltalk-80, the block seems to have access to the
 instantiation site's variables, so that the block can change variables
 in the instantiation site. In Objective-C the block receives a copy of
 the variables and cannot use them to communicate with the
 instantiation site. I believe that this is the sole functional
 difference between the two schemes.


      The inferencing machinery's primary current virtue is that it can
 be made to work for selected test cases. It leaves lots to be desired.
 Call me if you decide to extend it so that I can  prevent  unnecessary
 duplication of effort.



                         About the distribution


      The top level of the distribution consists of

     total 88
     -rw-r--r--  1 cox           181 Jun 22 14:32 Makefile
     -rw-r--r--  1 cox         26592 Jun 22 14:30 README
     drwxr-xr-x  2 cox           512 Jun 22 14:19 example
     -rw-r--r--  1 cox           166 Jun 16 13:18 log
     -rw-r--r--  1 cox           997 Jun 15 11:09 mac.me
     -rw-r--r--  1 cox         26751 Jun 15 11:02 producer.me
     -rw-r--r--  1 cox         21444 Jun 22 14:29 readme.me
     drwxr-xr-x  2 cox           512 Jun 12 10:22 rules
     drwxr-xr-x  2 cox          3072 Jun 22 14:31 src

 The Makefile governs formatting of  the  two  documents;  this  README
 (from   readme.me)   and  the  draft  of  the  OOPSLA-87  paper  (from
 Producer.me). The mac.me file contains text formatting macros that are
 common to both papers; used like this:

     nroff -me mac.me Producer.me >Producer.f


      The rules directory contains a single file, generic.ru, that
 represents a first pass at an application-independent rules base. This
 set of rules translates Smalltalk to the conventions used in my
 prototype version of the user interface library.

      For example, it translates Smalltalk Integer operations to C  int
 operations,  and  it translates Smalltalk Point operations to C macros
 that manage points as type PT; a pair of 16-bit coordinates in  a  32-
 bit  C  int.   For  example,  pt(x,y) invokes a C macro that trims and
 shifts two ints, x and y, to fit side by side  in  a  32-bit  integer,
 ptPlus(p,q)  invokes  a  macro  that  computes  the  vector sum of two
 points, p and q, etc.

     rules:
     total 35
     -rw-r--r--  1 cox         35567 Jun 12 10:22 generic.ru
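
      The PT packing that generic.ru targets, two 16-bit coordinates
 side by side in 32 bits, can be sketched as below. The real definitions
 are C macros in the user interface library and are not part of this
 distribution, so everything here is an assumption: inline functions
 replace macros, uint32_t replaces int, x is placed in the high half,
 and the accessors ptX and ptY are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t PT;  /* x in the high 16 bits, y in the low 16 (assumed) */

/* Trim each coordinate to 16 bits and place the two side by side. */
static inline PT pt(int x, int y)
{
    return ((uint32_t)(uint16_t)x << 16) | (uint16_t)y;
}

/* Reinterpret a 16-bit field as a signed coordinate. */
static inline int pt_sign16(uint32_t v)
{
    return v >= 0x8000u ? (int)v - 0x10000 : (int)v;
}

static inline int ptX(PT p) { return pt_sign16(p >> 16); }
static inline int ptY(PT p) { return pt_sign16(p & 0xFFFFu); }

/* Vector sum, computed per coordinate so y cannot carry into x. */
static inline PT ptPlus(PT p, PT q)
{
    return pt(ptX(p) + ptX(q), ptY(p) + ptY(q));
}

/* Usage: the vector sum of (3, -4) and (10, 20) is (13, 16). */
void pt_example(void)
{
    PT sum = ptPlus(pt(3, -4), pt(10, 20));
    assert(ptX(sum) == 13 && ptY(sum) == 16);
}
```

 Adding the two packed words directly would let a carry or borrow from
 the low half corrupt the high half, which is why the sum above unpacks
 each coordinate first.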


      The example directory contains a fragment from the video
 animation program that appears at the end of the Smalltalk-80 video
 tape. BounceInBoxNode.st is the Smalltalk-80 source file, animation.ru
 contains the application-specific rule set, and BounceInBoxNode.m is
 the translated version built by Producer as invoked by Makefile[3].

     example:
     total 7
     -rw-r--r--  1 cox          1730 Jun 16 10:24 BounceInBoxNode.m
     -rw-r--r--  1 cox           868 Jun 16 10:18 BounceInBoxNode.st
     -rw-r--r--  1 cox           394 Jun 16 10:20 Makefile
     -rw-r--r--  1 cox          2178 Jun 16 10:18 animation.ru
     -rw-r--r--  1 cox           185 Jun 16 10:24 log
     -rw-r--r--  1 cox           239 Jun 16 10:18 st80.h


      The log file records the results of the translation session.  The
 syntax error is innocuous, the result of the aforementioned problem
 in the grammar in handling '!' delimiters.

     Producer -c ../rules/generic.ru animation.ru BounceInBoxNode.st >BounceInBoxNode.m
     error 7:BounceInBoxNode.st: tegory:'Graphics-Animation'!! : syntax error
     *** Error code 1 (ignored)


      The src directory contains the sources for Producer, with its own
 Makefile. The Substrate.h header file, which is automatically included
 by the Producer.h header file, is technically a part of an internal
 lower level library, Substrate, on which Producer was originally
 developed. Substrate.h was copied and changed superficially so that
 Producer compiles correctly without the Substrate library.

     src:
     total 70
     -rw-r--r--  1 cox           483 Jun 12 10:21 AbstractTranslation.m
     -rw-r--r--  1 cox           282 Jun 12 10:21 ArgumentList.m
     -rw-r--r--  1 cox           897 Jun 12 10:21 Block.m
     -rw-r--r--  1 cox           143 Jun 12 10:21 CharConstant.m
     -rw-r--r--  1 cox          2205 Jun 12 10:21 Class.m
     -rw-r--r--  1 cox           630 Jun 12 10:21 Comment.m
     -rw-r--r--  1 cox           176 Jun 12 10:21 Constant.m
     -rw-r--r--  1 cox          2032 Jun 12 10:21 Expr.m
     -rw-r--r--  1 cox          1243 Jun 12 10:21 FunctionTranslation.m
     -rw-r--r--  1 cox          1484 Jun 12 10:21 Identifier.m
     -rw-r--r--  1 cox          1248 Jun 12 10:21 IdentifierTranslation.m
     -rw-r--r--  1 cox           105 Jun 12 10:21 List.m
     -rw-r--r--  1 cox          1985 Jun 15 11:55 METHODDECLS.m
     -rw-r--r--  1 cox          1384 Jun 15 11:51 Makefile
     -rw-r--r--  1 cox          4302 Jun 12 10:21 Method.m
     -rw-r--r--  1 cox          3136 Jun 12 10:21 Msg.m
     -rw-r--r--  1 cox           583 Jun 12 10:21 MsgArgPattern.m
     -rw-r--r--  1 cox           828 Jun 12 10:21 MsgNamePattern.m
     -rw-r--r--  1 cox          1280 Jun 12 10:21 MsgTranslation.m
     -rw-r--r--  1 cox           775 Jun 12 10:21 MsgTranslator.m
     -rw-r--r--  1 cox          1868 Jun 12 10:21 Node.m
     -rw-r--r--  1 cox           229 Jun 12 10:21 NumberConstant.m
     -rw-r--r--  1 cox          1402 Jun 15 11:27 Producer.h
     -rw-r--r--  1 cox           306 Jun 12 10:21 Return.m
     -rw-r--r--  1 cox           825 Jun 12 10:21 Scope.m
     -rw-r--r--  1 cox          3157 Jun 12 10:21 Selector.m
     -rw-r--r--  1 cox           253 Jun 12 10:21 SelectorConstant.m
     -rw-r--r--  1 cox           457 Jun 12 10:21 StArray.m
     -rw-r--r--  1 cox           492 Jun 12 10:21 Stmt.m
     -rw-r--r--  1 cox           381 Jun 12 10:21 StringConstant.m
     -rw-r--r--  1 cox          1268 Jun 12 10:21 StringTranslation.m
     -rw-r--r--  1 cox          2140 Jun 15 11:38 Substrate.h
     -rw-r--r--  1 cox          1405 Jun 15 11:53 Symbol.m
     -rw-r--r--  1 cox           452 Jun 12 10:21 Template.m
     -rw-r--r--  1 cox           901 Jun 12 10:21 Type.m
     -rw-r--r--  1 cox          1800 Jun 12 10:21 design.me
     -rw-r--r--  1 cox          3271 Jun 12 10:21 gen.m
     -rw-r--r--  1 cox          9007 Jun 12 10:21 gram.y
     -rw-r--r--  1 cox          3601 Jun 12 10:21 lex.l
     -rw-r--r--  1 cox          2212 Jun 12 10:21 main.m
     -rw-r--r--  1 cox           260 Jun 12 10:21 st80.h
     -rw-r--r--  1 cox           259 Jun 15 11:59 y.tab.h
 ____________________

 [3] The full source for the animation program is not provided. My
 copyright paranoia argued against providing even this fragment.


      The files are exactly as I left them nearly a  year  and  a  half
 ago, except for:

 (1)  The addition of this README  document.  An  early  draft  of  the
      OOPSLA-87 paper, sadly prior to Kurt Schmucker's improvements, is
      in Producer.me.

 (2)  One recompilation pass to remove any obvious dependencies on my
      private Substrate library and to verify that Producer compiles
      and runs correctly on the standard Foundation library. I tested
      the changes by verifying that the Makefile in the example
      directory ran to completion, but this is hardly an ironclad
      guarantee.



                               Using Producer


      Flags controlling the  translation  process,  source  files,  and
 rules  files are provided on the command line and are processed in the
 order they appear.  The flags are[4]

 -d:  Enable debugging functions (dbg()) scattered throughout the code.
      Seldom useful.

 -m:  Enables  the  Objective-C  Foundation  library  message   tracing
      feature. Seldom useful in Producer.

 -a:  Enables the Objective-C  Foundation  library  allocation  tracing
      feature. Seldom useful in Producer.

 -l:  Enables printing of each lexical token as produced by lex. Useful
      only for debugging lex.l.

 -g:  Enables automatic redirection of each class into a separate  file
      based on the class name parsed from the input file. Automatically
      puts class Foobar into file Foobar.m.

          CAREFUL! This puts at risk other files whose  name  might
          coincide with a Smalltalk-80 class name!


 -s:  Generate Smalltalk-80 sources in the output file  as  Objective-C
      comments (the default).

 -c:  Don't generate Smalltalk-80 sources in the output file.

 -i:  Generate information that was thought at one time  to  be  useful
      when debugging rules.

 -M:  Send storeOn: to the message rule  dictionary  just  before  ter-
      minating as a debugging aid.

 -I:  Send storeOn: to the variable rule dictionary  just  before  ter-
      minating as a debugging aid.

      Typically, the generic rules in rules/generic.ru  are  specified
 first, then any application-specific rules, then a single Smalltalk-80
 source file.  Unless -g is  set,  the  translated  output  appears  on
 stdout.  The  various  creaks, groans and mumbles that can be elicited
 about the translation process itself appear on stderr.

      For the syntax for writing new rules, refer to  the  examples  in
 generic.ru  and  animation.ru,  and if necessary, the rules section of
 the grammar in gram.y.

      And good luck! Let me know how you fare...

 ____________________

 [4] I'm working from memory about what these flags mean. Some may be
 nonfunctional.