/edbrowse

A command-line editor and web browser.

Primary LanguageCOtherNOASSERTION

edbrowse, a line oriented editor browser.
Maintained by Chris Brannon, chris@the-brannons.com
Written and previously maintained by Karl Dahlke, eklhad@gmail.com
See COPYING for licensing agreements.

------------------------------------------------------------

Netscape and Explorer are graphical browsers.
Lynx and Links are screen browsers.
This is a command line browser; the only one of its kind.

The user's guide can be found in doc/usersguide.html (in this package).
Of course this reasoning is a bit circular.
You need to use a browser to read the documentation,
which describes how to use the browser.
Well you can always do this:

cd doc ; lynx -dump usersguide.html >usersguide.txt

This produces the documentation in text form,
which you can read using your favorite editor.
Of course we hope edbrowse will eventually become your
favorite editor, whence you can browse the documentation directly.

The doc directory also includes a sample config file, and a script to help you
set up your own, customized config file.
Run setup.ebrc and answer the questions.
(This assumes you are using bash.)

------------------------------------------------------------

Ok, I'm going to assume you've read the documentation.
No need to repeat all that here.
You're here because you want to compile the program, or modify the source
in some way.  Great!

I haven't developed a configure script, and I really need to do that,
but I just don't have the time.
So I'm hoping this entire package will build on any Unix platform as is.
That's wildly optimistic, but there it is.
(I haven't heard too many problems in this regard.)

As you may know, edbrowse was originally a perl script.
(Perl is a fantastic prototyping language, but not always the best choice
for serious software.)
As such, it was only natural to use perl regular expressions for
my search/substitute functions in the editor.
And once you've experienced the power of perl regexp, you'll never
go back to ed.  So I use the perl compatible regular expression
library, /lib/libpcre.so.0, available on most Linux systems.
If you don't have this file, check your installation disks.
the pcre and pcre-devel packages might be there, just not installed.
If you can't find them there, you can get the pcre source here.

ftp://ftp.sourceforge.net/pub/sourceforge/pcre/

Note that my files include <pcre.h>.
Some distributions put it in /usr/include/pcre/pcre.h,
so you'll have to adjust the source, the -I path, or make a link.
Yeah I know, that's something a configure script should do for you,
but I'm just not there yet.

We assume you have the ssl libraries installed, for secure connections.
You just can't buy anything on the Net without secure connections,
so I make no effort to "work around" these libraries.
You just have to have them, period.
If you don't have /lib/libssl.so.2, you'll need to install the library,
or build the package from source.

http://www.openssl.org/source/

You also need libcurl, which is included in almost every Linux distro.
This is used for ftp, http, and https.
Check for /usr/include/curl/curl.h
Edbrowse requires version 7.17.0 (or later).  If you compiled with a version
prior to 7.17.0, the program will inform you that you need to upgrade.

Finally, you need the Spider Monkey javascript engine from Mozilla.org
(I got smart and decided not to reinvent all this js machinery.)
ftp://ftp.mozilla.org/pub/mozilla.org/js/
As of edbrowse version 3.4.8,
we now require version 1.8.5 -- or higher -- of Spidermonkey.
Donwload a package and expand it in /usr/local.
It will create a directory js.
Go into js/src and run
gmake -f Makefile.ref
This should build everything you need.
cd Linux_All_DBG.OBJ
ln -s `pwd`/libjs.so /usr/lib/libsmjs.so

If you want database access, you need unixODBC and unixODBC-devel.
Make edbrowseodbc, rather than edbrowse.
You may want to run make clean first,
because edbrowseodbc has some conditional compilation that won't happen
if you run make edbrowse and make edbrowseodbc sequentially.

Now return to your edbrowse src directory.
To build edbrowse - guess what - you type make.
(Mac users type make -f makefile.osx.)
Then you send me mail when it doesn't work.  :-)
Not a bad idea to read through the makefile before running make.
It's pretty simple.

If you're a developer, and you want to use the symbolic debugger,
set EBDEBUG=on before you run make.

If it compiles,
there is no guarantee that this program will perform as expected.
It might trash your precious files.
It might send bad data across the Internet,
causing you to buy a $37,000 elephant instead of
$37 worth of printer supplies.
No guarantees whatsoever.
Use this program at your own risk.
There - now the lawyers are happy.

------------------------------------------------------------

Blind people could care less about indenting their code.
In fact we would rather not.  It's a real nuisance!
But in deference to my sighted colleagues, I have run all
the source through indent(1), using the settings in tools/indent.pro.
Braces are in the style of the linux kernel, which I like.
We don't waste time (especially on a braile display)
consuming an extra line just for the left brace.
If you modify this code, please use the ebindent script, found in the tools
directory of the edbrowse source tree, to insure that modified files still
adhere to this style.
For instance, if you change src/main.c, just run
../tools/ebindent main.c # Assuming that the working directory is src
Thank you for your understanding.

------------------------------------------------------------

Debug levels:
0: silent
1: show the sizes of files and web pages as they are read and written
2: show the url as you call up a web page,
   and javascript execution errors
3: show each url through redirection,
   cookies, http codes, form data, and sql statements logged.
4: show the socket connections, and the http headers in and out
5: url resolution
6: free and alloc text lines, show javascript
7: reformatted regular expressions, breakline chunks.
8: not used
9: not used

------------------------------------------------------------

Sourcefiles (in the src directory).

main.c:
Read and parse the config file.
Entry point.
Command line options.
Invoke mail client if mail options are present.
If run as editor/browser, treat arguments as files or URLs
and read them into buffers.
Read commands from stdin and invoke them via the command
interpreter in buffers.c.

buffers.c:
Manage all the text buffers.
Interpret the standard ed commands, move, copy, delete, substitute, etc.

stringfile.c:
Helper functions to manage memory, strings, files, directories.
Hides many of the OS specific quirks.

url.c:
Split a url into its components.
Decide if it's a proxy url.
Watch for infinite loops during url redirection.
Resolve relative url into absolute url
based on the location of the current web page.

tcp.c:
Place a wrapper around the tcp calls.
Hides the differences between Unix and Windows.
Yes, I want to port this program to Windows.
Can you help??

format.c:
Parse html tags and comments.
Translate common unicode sequences.
Arrange text into lines and paragraphs.

http.c:
Send the http request, and read the data from the web server.
Handles https connections as well,
and 301/302 redirection.
Uncompresses html data if necessary.

html.c:
Render the html tags and format the text.
Update input fields.
Submit/reset forms.

cookies.c:
Send and receive cookies.  Maintain the cookie jar.

auth.c:
Remember user and password for web pages that require authentication.
Only the basic method is implemented at this time.

sendmail.c:
Send mail (smtp).  Encode attachments.

fetchmail.c:
Fetch mail (pop3).  Decode attachments.
Browse mail files, separate mime components.

messages.h:
Symbolic constants for the warning/error messages of edbrowse.

messages.c:
Strings corresponding to these error messages,
in various languages.
Also some international print routines to display the correct string,
according to your locale.

jsdom.c:
Document object model under javascript.
Build objects for the hyperlinks, forms, elements, etc.
Includes basic methods like alert() prompt() window.open() etc.

jsloc.c:
The location object, and other objects (like document.cookie)
with strange side effects.

jsrt:
This is the javascript regression test for edbrowse.
It exercises some of the javascript DOM interactions.

dbops.c:
Database operations; insert update delete.

dbodbc.c:
Connect edbrowse to odbc.

dbinfx.ec:
Connect edbrowse directly to Informix.

------------------------------------------------------------

Error conventions.
Unix commands return 0 for ok, and a negative number for a problem.
I return true for success and false for error.
The error message is in a buffer, which you can see by typing h,
in the /bin/ed fashion.
Sometimes the error is displayed no matter what,
like when you are reading or writing files.
error messages are created according to your locale, i.e. in your language,
if a translation is available.
Some error messages in the database world have not yet been internationalized.
Some are beyond my control, as they come from odbc or its native driver.

------------------------------------------------------------

Multiple Representations.

A web form asks for your name, and you type in Fred Flintstone.
This piece of data is part of your edbrowse buffer.
In this sense it is merely text.
You can make corrections with the substitute command,
use the undo command to back up, etc.
Yet it is also carried into the html tags in html.c,
so that it can be sent when you push the submit button.
This is a second copy of the data.
As if that weren't bad enough, I now need a third copy for javascript.
When js accesses form.fullname, it needs to find,
and in bizarre cases change, the text that you entered.
I believe the first representation, in your editor, must be separate
from the second and third (which I have managed to merge together).
Remember that an input field could be an antire text area,
i.e. the text in another editing session.
When you are in that session, composing your thoughts,
am I really going to take every textual change, every substitute,
every delete, every insert, every undo,
and map those changes over to the html tag that goes with this session,
and the java variable that goes with this session?
I don't think so!
So I'm stuck with something almost as bad.
As part of the browse command, the field data, whether set up by html
or modified by javascript, must be copied into your editor text.
Then you can change it to your heart's content, but when you submit the form,
or run any javascript, for any reason, all that text has to be carried
back to the javascript world.
This is accomplished by jSyncup() in html.c.
When javascript is done, any changes it has made to the fields have
to be mapped back to the editor, where you can see them.
This is done by javaSetsTagVar() in html.c.

You should really receive some alerts if fields
have been changed out from under you.
A sighted person would see the screen change, but you can't.
You know that field says Fred Flintstone, that's what you typed
in, but javascript has changed it to Barney Rubble, and you need
to be notified.
Edbrowse does this for you.

------------------------------------------------------------

To look at all this source, some 30,000 lines,
you wouldn't know I was a professional programmer with 25 years experience.
I mean it really looks hacked together.
Well - I wrote it in my spare time, because I needed this tool desperately.
No careful methodology, just "Gitter Duhn".

------------------------------------------------------------

Edbrowse really needs a complete rewrite.
Today, it takes html text and tags, and builds
javascript objects to go along with those tags; and it builds the screen,
(I'll call it a screen, it's really a text buffer),
in parallel, at the same time.
But what it should do is not build the screen at all on the first pass.
Text would simply be turned into a string
and put in as an attribute in whatever object we are working on at the time,
or the document object if no tag has appeard.
All internal.
Then you run the javascript, and it creates more objects, perhaps,
creates lists dynamically,
builds entirely new paragraphs, images, links, whatever.
Then pass 3, render the screen by traversing the tree of objects
depth first.
Now, if anything changes, through some onclick code,
you have a new tree of objects,
so Simply rerender the screen.
Maybe this is what standard browsers do; I don't know.
But there's something we should do that they don't.
Keep a copy of the old screen and run diff.
If one or two lines have changed,
or if a menu has changed behind the scenes, report that to the user.
It's something a sighted person would notice right away,
but I would be left in the dark.
If there are many changes, just say "entire screen has changed".
Anyways, this is the kind of rewrite that needs to happen.
But I *way* don't have the time.
And even though I can see the design in my head, I'm not sure I have the skills.
Maybe I underestimate myself, but it looks really complicated.

We also have to ask if this would be a good time to switch from C to C++.
There are pros and cons here, the major con being that I'd have to learn C++,
something I've been avoiding for 25 years.    :-)
But all that code dedicated to growing strings dynamically,
and freeing them without memory leaks, etc, etc, has become a royal pain.
The C++ string library would solve most of this.

I just wish someone would pay me to do all this stuff.
An NSF grant or something.
God knows they fund sillier things.
But I have no idea how to win a grant.
So in lieu of that, I just don't have the time that
would be required to take some of the next big steps.