tools4j/unix4j

Unix4j sed has nothing in common with Unix sed

Closed this issue · 1 comments

I believe Unix4j is promising too much for "sed". The project description reads:
"Implementation of Unix command line tools in Java. You can use the commands that you know from Unix in a Java program---you can pipe the results of one command to another as you know it from Unix."

The standard Unix "sed" options are listed in the PS. Of those, Unix4j implements:
-n

and the other Unix4j sed options:
-g
-p
-l (misleading, every sed uses it as the line-length parameter)
-I
-s
-a
-i (misleading, every sed uses it for in-place replacement)
-c
-d
-y

have nothing to do with a Unix sed. Or am I missing something? Could you please document how to achieve a simple Unix sed search and in-place replacement (option "-i") in a file with Unix4j:

  1. cat test.txt
    Unix4j is not such a great tool as I thought.

  2. sed -i 's/is not/is/' test.txt

  3. cat test.txt
    Unix4j is such a great tool as I thought.

Or let me know if you have any plans to add a Unix-style "-i" sed option to the unix4j package soon.

If you have no such plans, please list on your project page only those unix4j commands that behave like the corresponding Unix command (like unix4j.grep).

Thanks
siarsky

PS:

$ sed --help
Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
-n, --quiet, --silent suppress automatic printing of pattern space
-e script, --expression=script add the script to the commands to be executed
-f script-file, --file=script-file add the contents of script-file to the commands to be executed
--follow-symlinks follow symlinks when processing in place
-i[SUFFIX], --in-place[=SUFFIX] edit files in place (makes backup if SUFFIX supplied)
-b, --binary open files in binary mode (CR+LFs are not processed specially)
-l N, --line-length=N specify the desired line-wrap length for the `l' command
--posix disable all GNU extensions.
-E, -r, --regexp-extended use extended regular expressions in the script (for portability use POSIX -E).
-s, --separate consider files as separate rather than as a single, continuous long stream.
--sandbox operate in sandbox mode.
-u, --unbuffered load minimal amounts of data from the input files and flush the output buffers more often
-z, --null-data separate lines by NUL characters
--help display this help and exit
--version output version information and exit

Hi Siarsky,

Thanks for your question and your feedback.

Firstly, yes, there will be some differences from the Unix commands as you know them. We tried to stick closely to a very minimal definition of the commands, but even there we deviate in places because unix4j uses some different concepts.

Secondly, we used the following command doc as a basis for our commands:
https://pubs.opengroup.org/onlinepubs/009695399/utilities/sed.html

Note that some of the letters above are referred to as functions in the sed documentation, but implemented as options in unix4j.

In-place replacement is currently not directly supported by unix4j. It is a bit tricky to implement and typically requires random access to files and/or in-memory buffering of parts of the file, neither of which unix4j implements. (I am not sure how it is implemented internally in Unix variants; they may also create a new temp file first and rename it at the end.)
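
For illustration, here is a minimal sketch of the temp-file-and-rename approach mentioned above. It uses only the JDK standard library and is not part of unix4j; the class name InPlaceSed and the method substituteInPlace are hypothetical. It assumes UTF-8 text, uses Java regex syntax rather than sed's, and replaces only the first match on each line, roughly like sed -i 's/regex/replacement/'.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.regex.Pattern;

// Hypothetical helper, not a unix4j class.
final class InPlaceSed {

    /**
     * Rewrites the file so that the first match of regex on each line is
     * replaced, by writing a temp file in the same directory and then
     * moving it over the original.
     */
    static void substituteInPlace(Path file, String regex, String replacement)
            throws IOException {
        Pattern pattern = Pattern.compile(regex);
        // Same directory as the target, so the final move stays on one file system.
        Path dir = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "sed", ".tmp");
        try {
            try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                 BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
                String line;
                while ((line = in.readLine()) != null) {
                    // replaceFirst mirrors sed's s/// without the g flag.
                    out.write(pattern.matcher(line).replaceFirst(replacement));
                    out.newLine();
                }
            }
            // Replace the original only after the new content is fully written.
            Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up if something failed.
            Files.deleteIfExists(temp);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: InPlaceSed <file> <regex> <replacement>");
            return;
        }
        substituteInPlace(Paths.get(args[0]), args[1], args[2]);
    }
}
```

With Java 11 or later this can be tried as `java InPlaceSed.java test.txt "is not" is`. Note the limits of the sketch: the replacement string follows Java's rules (`$1` and `\` are special), line endings are normalized to the platform separator, and the rewritten file takes on the temp file's permissions rather than the original's.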

As a final note, please feel free to use unix4j as is if you like it (as many others do btw). If you do not like it then you have basically 3 options: (1) provide a pull request with fixes/enhancements/improvements (we are always happy to review and consider pull requests), (2) write your own tool (e.g. port a unix c source to java if you like), or (3) simply don't use it and look for another tool or implementation.

Thanks for your honest feedback.
Regards, Marco