wolph/python-progressbar

Multiple progress bars appear at once in IntelliJ IDEA

HatScripts opened this issue · 25 comments

Description

This obviously doesn't seem to be intended behaviour, but I'm not sure whether it's a problem with progressbar, with IntelliJ, or with my configuration.

image

Here's the full console output from running examples.py.

Code

Really any code that outputs a progress bar to the console, e.g.

import time
import progressbar

with progressbar.ProgressBar(max_value=10) as progress:
    for i in range(10):
        time.sleep(0.1)
        progress.update(i)

Versions

  • Python version: 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 07:18:10) [MSC v.1900 32 bit (Intel)]
  • Python distribution/environment: IDLE
  • Operating System: Windows 10
  • Package version: 3.18.1

Edit:
I just tested the following simple function and the issue still occurs. Guess that means it's probably an issue with IntelliJ IDEA.

import sys
from time import sleep

def update_progress(i, total, length=10, fg="#", bg=" ", decimals=0):
    progress = 100 * (i / float(total))
    blocks = int(length * i // total)
    bar = fg * blocks + bg * (length - blocks)
    sys.stderr.write(f"\r[{bar}] {progress:.{decimals}f}%")
    sys.stderr.flush()

for i in range(0, 100):
    update_progress(i, 99)
    sleep(0.1)
wolph commented

I've done a few tests and unfortunately it appears to be a bug in IntelliJ IDEA

The simplest code you can run to test is roughly this:

import sys
import time


print('\nUsing stderr:')
for i in range(100):
    time.sleep(0.1)
    sys.stderr.write('\r%05d' % i)
    sys.stderr.flush()

print('\nUsing stdout:')
for i in range(100):
    time.sleep(0.1)
    sys.stdout.write('\r%05d' % i)
    sys.stdout.flush()

Indeed, it does appear to be a bug in IntelliJ IDEA.

I found these issue reports pertaining to it on JetBrains' issue tracker:

If you're able to, please vote on them so that JetBrains can hopefully fix this.

wolph commented

I will, though I can't say I'm too hopeful about a quick fix, unfortunately. My experience with JetBrains support is not too great. It generally takes years for them to fix issues.

wolph commented

To illustrate that point, the storing of SSH key passwords (and some related features) has been broken for the whole JetBrains suite since 2009 or perhaps even earlier: https://youtrack.jetbrains.com/issue/IDEA-24944#tab=Linked%20Issues

That's a bug with "Major" priority and has been reported 8 separate times.

wolph commented

It's actually even more broken than I thought (and has been for years it seems):

image

The issue can be mitigated slightly by running the script in debug mode, but that doesn't solve it.

@wolph Wow, you're right. Just tested it with Java as well to confirm it's not somehow Python-specific, and got the same kinds of results, where stdout and stderr are interleaved.

Because I've only ever used IntelliJ as my IDE, I always thought this is how stdout and stderr were naturally supposed to behave (at least in Java). I thought the order they appear in the console wouldn't necessarily correlate to the order the print methods are called in, because it seems there is a thread which handles stdout and a thread which handles stderr.

However, when I compile and run the same program from the command line, I get the output of 0 to 19, in the correct order, for both Java and Python. In retrospect this makes a lot more sense and this clearly is a bug with IntelliJ.
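
The test program behind those results is only visible in the screenshot above. A minimal sketch of that kind of test, assuming it simply alternates writes between the two streams (the name `alternate_streams` and its parameters are made up for illustration):

```python
import sys


def alternate_streams(count=20, out=None, err=None):
    """Write 0..count-1, sending even numbers to out and odd numbers to err."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    for i in range(count):
        stream = out if i % 2 == 0 else err
        stream.write(f"{i}\n")
        # Flush after every write so Python-side buffering cannot hold anything back.
        stream.flush()


if __name__ == "__main__":
    alternate_streams()
```

Since the writes happen one after another in a single thread, a console that shows them in any order other than 0 to 19 is reordering them itself.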

But is this the same bug that causes printing \r to give unexpected results? You don't need to print to both streams to get weird results from \r:

import sys
import time

for i in range(100):
    time.sleep(0.1)
    sys.stdout.write('\r%05d' % i)
    sys.stdout.flush()
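
For reference, a sketch of what a console is expected to do with \r: move the cursor back to the start of the current line so that the next characters overwrite what is already there. `render` is a hypothetical helper that models this behaviour; it is not taken from IntelliJ or from progressbar:

```python
def render(text):
    """Return the lines a console honouring \\r and \\n would end up showing."""
    lines = [[]]
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0  # carriage return: back to column 0, same line
        elif ch == "\n":
            lines.append([])  # newline: start a fresh line
            col = 0
        else:
            line = lines[-1]
            if col < len(line):
                line[col] = ch  # overwrite an existing character
            else:
                line.append(ch)
            col += 1
    return ["".join(line) for line in lines]
```

With the loop above, a console that honours \r only ever shows a single five-digit line. Seeing several values at once suggests the \r is being treated as something else.
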
wolph commented

It's a different bug but it could be related. The JetBrains console code is obviously quite broken and I'm not too hopeful that they'll fix it any time soon :(

I just found another bug report from 2 years ago: https://youtrack.jetbrains.com/issue/PY-17489

wolph commented

Since this is a bug on JetBrains' side which I can't fix, I'll have to close it.

I don't see this issue in PyCharm 2017.3.2 with:

with progressbar.ProgressBar(max_value=len(files)) as bar:
    for file in bar(files):
        # do stuff
        bar.update(bar.value)

Only oddity is the progress string is printed again on completion:

D:\dev\Python364\python.exe -m tools.update_json_with_text
Processing files...
100% (1397 of 1397) |#####################| Elapsed Time: 0:00:44 Time: 0:00:44
100% (1397 of 1397) |#####################| Elapsed Time: 0:00:44 Time: 0:00:44

Process finished with exit code 0
wolph commented

Since it's printed twice I'm guessing the issue still exists, but that's not to say the issue hasn't been improved.

I'll do some testing :)

wolph commented

It still looks quite broken to me, but they've decreased the update interval I think

PyCharm's console is not a console; it is a terminal, and it will behave as such. If you are running PyCharm on Windows it will not work properly, because PyCharm's console window does not respond to the same commands as the Windows console window. That is just how it is. I do believe that if all cursor movements are done using ANSI escape codes, it will work properly in PyCharm. The question is how to detect this in order to do the right thing.

wolph commented

The PyCharm terminal supports a subset of the ANSI escape codes, which is enough for this library to work.

The bug within PyCharm is in the timing between stderr and stdout, which appears to be fully random and causes interleaved results. If PyCharm printed the output in the order it was sent to the streams (which is deterministic and consistent), it would work fine. But it appears to have 2 separately timed threads which print character by character to the actual terminal, regardless of the order in which the data was received.
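
A rough simulation of that failure mode, assuming each stream is drained by its own thread on its own schedule. This is a guess at the mechanism, not something taken from the IDE's source, and all names are made up for illustration:

```python
import queue
import threading
import time


def drain(source, sink, delay, lock):
    """Forward items from source to sink one at a time, pausing before each."""
    while True:
        item = source.get()
        if item is None:  # sentinel: the writer is done
            return
        time.sleep(delay)
        with lock:
            sink.append(item)


def simulate(writes, out_delay=0.001, err_delay=0.003):
    """writes is an ordered list of (stream_name, text) pairs."""
    queues = {"stdout": queue.Queue(), "stderr": queue.Queue()}
    shown = []  # what the simulated console ends up displaying
    lock = threading.Lock()
    threads = [
        threading.Thread(target=drain, args=(queues["stdout"], shown, out_delay, lock)),
        threading.Thread(target=drain, args=(queues["stderr"], shown, err_delay, lock)),
    ]
    for t in threads:
        t.start()
    for name, text in writes:
        queues[name].put((name, text))
    for q in queues.values():
        q.put(None)
    for t in threads:
        t.join()
    return shown
```

Each stream keeps its own internal order, but the combined order in `shown` depends on the two delays rather than on the order of the original writes.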

OK, so here's my question, seeing as you just mentioned 2 different output mechanisms:

Does your package allow for outputting to both stdout and stderr? And if the answer is yes, do you have a thread lock in place so only one output buffer can be used at a time? I have not messed around with ANSI codes in the PyCharm console window, but I did do some testing, and we can detect whether the script is running in a PyCharm console and make adjustments as needed. I still have to figure out how to get the position and console window size if it is a PyCharm console.
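
On the detection question, one commonly used heuristic is the `PYCHARM_HOSTED` environment variable, which PyCharm is widely reported to set for scripts it launches. Treat that as an assumption to verify on your own setup, not a documented guarantee. `console_hints` is a hypothetical helper:

```python
import os
import sys


def console_hints(environ=None, stream=None):
    """Return (launched_by_pycharm, stream_is_tty) as best-effort guesses."""
    environ = os.environ if environ is None else environ
    stream = sys.stderr if stream is None else stream
    # Assumption: PyCharm sets PYCHARM_HOSTED for processes started from a run configuration.
    hosted = "PYCHARM_HOSTED" in environ
    # isatty() reports whether the stream is attached to a terminal-like device.
    is_tty = hasattr(stream, "isatty") and stream.isatty()
    return hosted, is_tty


if __name__ == "__main__":
    print(console_hints())
```

For the window size, `shutil.get_terminal_size()` from the standard library returns a fallback value when the console cannot be queried, so it is at least safe to call.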

wolph commented

The threading part is not even relevant here. Try this bit of code and you'll see: #115 (comment)

But yes, you could write to either stdout or stderr depending on your preference.
And your own stdout and stderr data can optionally be captured so it properly coincides with the progressbar without needing ANSI cursor manipulation.
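
The general idea behind that kind of capturing can be sketched with the standard library alone. This is not progressbar's actual implementation; `BarAwareStream` and its methods are hypothetical names, and the sketch assumes the bar occupies a single line that is redrawn with a plain \r:

```python
class BarAwareStream:
    """Wraps a stream so ordinary messages and a one-line bar do not clobber each other."""

    def __init__(self, target):
        self.target = target
        self.bar = ""  # text of the bar currently shown on the last line

    def draw_bar(self, text):
        # Pad with spaces so a shorter bar fully covers the previous one.
        padding = " " * max(0, len(self.bar) - len(text))
        self.target.write("\r" + text + padding)
        self.target.flush()
        self.bar = text

    def message(self, text):
        # Blank out the bar, print the message on that line, then redraw the bar below it.
        self.target.write("\r" + " " * len(self.bar) + "\r" + text + "\n")
        self.target.write(self.bar)
        self.target.flush()
```

Because everything goes through one object and one underlying stream, the order of bar updates and messages is fixed on the Python side.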

I believe I found the solution.

In PyCharm, click on the run dropdown,
then click on Edit Configurations.

Select the script on the left.
On the right, check "Emulate terminal in output console".

OK, maybe I didn't find the solution. I looked up the supported ANSI codes and it would appear that PyCharm only supports colors and nothing else, even though their help files state that the console window supports all functions of the native OS... LOL, yeah right.

That script worked just fine in PyCharm, not sure why. I do know that it does not support ANSI escape codes for moving the cursor.
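
A small check for cursor-movement support could look like this. The escape sequences are the standard ANSI/ECMA-48 ones (CSI 1 A for cursor up, CSI 2 K for erase line); the function name is made up:

```python
import sys

CURSOR_UP = "\x1b[1A"   # move the cursor up one line
ERASE_LINE = "\x1b[2K"  # erase the whole current line


def cursor_test_sequence():
    """Return text that prints two lines, then goes back up and rewrites the second one."""
    return "line 1\nline 2\n" + CURSOR_UP + ERASE_LINE + "line 2 (rewritten)\n"


if __name__ == "__main__":
    sys.stdout.write(cursor_test_sequence())
    sys.stdout.flush()
```

A console that honours both codes should end up showing only "line 1" and "line 2 (rewritten)". One that ignores them will show all three lines, possibly with stray escape characters.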

image

wolph commented

It seems they've decreased the update interval to such a point it often works. But this version is still very much broken:

import sys

for i in range(10):
    print(i, file=sys.stderr)

for i in range(10):
    print(i, file=sys.stdout)
wolph commented

For the purpose of this project though, it seems good enough. Especially with the terminal emulation enabled.

wolph commented

Here's a nice test:

import sys

for _ in range(10):
    for i in range(10):
        print(i, file=sys.stderr, end='')
        sys.stderr.flush()

    for i in range(10):
        print(i, file=sys.stdout, end='')
        sys.stdout.flush()

    print()

OK, so I have discovered from that example that the problem is not necessarily with PyCharm.

I modified the script a wee bit.

import sys

for _ in range(10):
    for i in range(10):
        sys.stderr.write(str(i))
    sys.stderr.flush()
    for i in range(10):
        sys.stdout.write(str(i))
    sys.stdout.write('\n')
    sys.stdout.flush()

and the results are as they should be.

I noticed this in the past when trying to print some data before a traceback occurs: it's hit or miss whether the output gets printed. The reason for this is twofold. The first is that writing to stderr has a higher priority than writing to stdout.

The second is that I believe print() has a whole lot more going on than writing directly to stdout. I think there is a buffer that stores the data, and a thread (timer) gets started when data is written to that buffer in order to flush it. The code above bypasses that buffer and does what it is supposed to do. Because of the flip-flopping between stdout and stderr with print(), the fact that any writes to stderr take a higher priority, and the empty print() (newline) with no flush after it, that newline is put on a timer as to when it will actually be sent to the console.

I think that PyCharm has its issues, but how Python handles writing data to stdout and stderr, and how print() works, adds another layer of complexity.

You can see what I am talking about if you then run this script:

import sys

for _ in range(10):
    for i in range(10):
        sys.stderr.write(str(i))
    sys.stderr.flush()
    for i in range(10):
        sys.stdout.write(str(i))
        sys.stdout.flush()
    sys.stdout.write('\n')
    sys.stdout.flush()
wolph commented

"I noticed this in the past when trying to print some data before a traceback occurs: it's hit or miss whether the output gets printed. The reason for this is twofold. The first is that writing to stderr has a higher priority than writing to stdout."

That shouldn't be the case though. We're not dealing with threaded writes; the writes are done in a single thread and have a deterministic order. Any reordering is done by IntelliJ of its own volition.

"The second is that I believe print() has a whole lot more going on than writing directly to stdout. I think there is a buffer that stores the data, and a thread (timer) gets started when data is written to that buffer in order to flush it. The code above bypasses that buffer and does what it is supposed to do. Because of the flip-flopping between stdout and stderr with print(), the fact that any writes to stderr take a higher priority, and the empty print() (newline) with no flush after it, that newline is put on a timer as to when it will actually be sent to the console."

I actually suspect it's more of a timing issue, to be honest. In reality print() shouldn't do much more than simply making sure a newline is added (unless you've set end=...). But I'm not ruling out that PyCharm is somehow wrapping print(); it seems a strange way to go, but that's possible too.

To illustrate my point:

import sys

for _ in range(10):
    for i in range(10):
        sys.stderr.write(str(i))
        sys.stderr.flush()

    sys.stderr.write('\n')
    sys.stderr.flush()

    for i in range(10):
        sys.stdout.write(str(i))
        sys.stdout.flush()

    sys.stdout.write('\n')
    sys.stdout.flush()

image
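
The point that print() does little beyond formatting its arguments and appending `end` can be checked against an in-memory stream. A small sketch using only the standard library (the two function names are made up):

```python
import io


def via_print(values):
    buf = io.StringIO()
    for v in values:
        # end="" suppresses the trailing newline; flush=True flushes after each call.
        print(v, end="", file=buf, flush=True)
    return buf.getvalue()


def via_write(values):
    buf = io.StringIO()
    for v in values:
        buf.write(str(v))
        buf.flush()
    return buf.getvalue()
```

Both functions put the same characters into the stream in the same order. This only covers the Python side, so it says nothing about what the IDE does with the data afterwards.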

wolph commented

It seems that the latest PyCharm versions have mitigated the issue enough that I think most people won't have to worry about it anymore. Let me know if anyone still has the issue and I'll reopen it :)