AlexBuz/llama-zip

Interactive mode generates text instead of compressing for specific inputs

secemp9 opened this issue · 2 comments

I managed to reproduce this consistently using three text strings:

  1. "test"
  2. "testme"
  3. "4bd84c0c58be5204c715c82ce55b06a5beb66a8583312699980d892de44170cdc4ccca7e5db8873d2dd341fdd95fddcd16db9d0405b368984a7c7ed8584c3c006983bcb4505c3782c6da3729b4aaf1029adf892825b007f59db89cd9b7cba6ad18ab24f2"

The third one is just random bytes (100 in length) generated from /dev/urandom, hex-encoded because interactive mode does not work on raw bytes (they cause a UTF-8 error, but that's a separate issue that I'll post later).

Here is the output of all three in interactive mode:

  1. Question:
    Suppose -2r - 3719 = 3r + i, 5 = -5*i. Let n = r - -392. Is 3/8 greater than n?
    Answer:
    True

  2. Question:
    Suppose -2r - 417 + 1106 = 5j, 0 = -5*j + 25. List the prime factors of r.
    Answer:
    3, 17

  3. North Carolina's 64th Judicial District consists of the counties of Caswell, Person and Rockingham.
    The District Attorney is responsible for prosecuting all criminal matters and child abuse and neglect matters in the District Courts. The DA's Office has a staff of over sixty employees, including the District Attorney and two Assistant District Attorneys.
    The District Attorney is accountable to the District Attorney's Office, located in the Person County Courthouse Building. The address of the District Attorney's Office is 125 W. Palmer Street, Roxboro, NC 27573-2246.
    District Court sessions are held in each of the three counties of our judicial district.
    Court Dates ##### Meetings held ##### Building ##### Room ##### Times
    Caswell Co. Caswell Co. Clerk's Meeting Room Caswell Co. Courthouse
    Person Co. Person Co. Clerk's Meeting Room Person Co. Courthouse
    Rockingham Co. Rockingham Co. Clerk's Meeting Room Rockingham Co. Courthouse
    If you would like to submit a compliment to a member of the District Attorney's staff, please click on this link to fill out a compliment form . It will then be emailed to the office at the completion of the form.
    If you would like to submit a complaint about a jury you have served on, a judge that has presided over your case, or a prosecutor who has handled your case, please click on this to link to fill out a complaint form . This form will also be emailed to the office at the completion of the form.

This can be seen in the saved version of the Colab notebook from the README: https://github.com/secemp9/test1/blob/main/Copy_of_llama_zip.ipynb

Check issue #2

Indeed, this behavior is expected: an input made up entirely of base64 characters, like "test", could conceivably have been the result of compression, so interactive mode treats it as compressed data. To work around this, you can either use compression mode directly or add a non-base64 character to your text, such as a space at the end, if you don't mind it being compressed with the rest of your text.
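To illustrate why all three strings from the report are ambiguous, here is a minimal sketch of a base64-alphabet check. This is not llama-zip's actual detection code, which may differ; `could_be_base64` is a hypothetical helper that only tests whether every character belongs to the standard base64 alphabet.

```python
import re

# Standard base64 alphabet, optionally followed by up to two '=' padding
# characters. This is an illustrative heuristic only; it is NOT taken from
# llama-zip's source.
_BASE64_PATTERN = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def could_be_base64(text: str) -> bool:
    """Return True if `text` consists solely of base64 characters."""
    return _BASE64_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["test", "testme", "4bd84c0c58be5204", "test ", "hello, world"]:
        print(repr(sample), could_be_base64(sample))
```

Under this heuristic, "test", "testme", and any hex string all pass, since letters and digits are valid base64 characters, while "test " with a trailing space does not. That is why appending a space is enough to disambiguate the input.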