GoogleCloudPlatform/DataflowPythonSDK

Logs not directed to the correct place

srossross opened this issue · 2 comments

I have a bunch of ParDo(DoFn) steps in my Beam pipeline that are not being set up correctly in Dataflow. Everything works fine when I test locally, but not when I run on Dataflow with streaming.

My pipeline is created like this:

class SplitTime(beam.DoFn):
    def __init__(self, minutes=60):
        self.minutes = minutes

    def process(self, element):
        ....
....
activity_chunks = activities | 'Split Into 15 minute Chunks' >> beam.ParDo(SplitTime(minutes=15))

I'm creating a template like this:

python functions/beam2.py --runner DataflowRunner --project my-great-project --staging_location gs://test-bucket/stage --temp_location gs://test-bucket/temp --setup_file functions/setup.py --template_location gs://test-bucket/templates/27/activity5

and running like this:

gcloud dataflow jobs run template-27-5 --gcs-location=gs://test-bucket/templates/27/activity5

None of the logs are showing up in the Dataflow UI. In Stackdriver I can see my logs:

2019-07-23T15:56:50.955574989Z No unique name set for transform generatedPtransform-2494 I 
2019-07-23T15:56:50.972083091Z No unique name set for transform generatedPtransform-2492 I 
2019-07-23T15:56:50.980607986Z No unique name set for transform -2482 I 
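To see which transforms the warning refers to, the lines above can be scanned with a small script. This is only a sketch: the regular expression assumes the exact message wording shown above, and `unnamed_transforms` is a hypothetical helper name, not part of Beam or Dataflow.

```python
import re

# Matches "<timestamp> No unique name set for transform <name>".
# The wording is taken from the log lines above; other versions may differ.
WARNING_RE = re.compile(r"^(\S+) No unique name set for transform (\S+)")

LOG_LINES = [
    "2019-07-23T15:56:50.955574989Z No unique name set for transform generatedPtransform-2494 I ",
    "2019-07-23T15:56:50.972083091Z No unique name set for transform generatedPtransform-2492 I ",
    "2019-07-23T15:56:50.980607986Z No unique name set for transform -2482 I ",
]


def unnamed_transforms(lines):
    """Return the transform names mentioned in the warning lines, in order."""
    names = []
    for line in lines:
        match = WARNING_RE.match(line)
        if match:
            names.append(match.group(2))
    return names


print(unnamed_transforms(LOG_LINES))
```

None of the extracted names match a label I set in the pipeline (such as 'Split Into 15 minute Chunks'), which suggests they are generated internally, though I have not confirmed that.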

How do I enforce a unique name for each transform?

The log metadata from output that I know comes from that step:

{
 insertId:  "5616980600106424110:3873:0:60547"  
 jsonPayload: {}  
 labels: {}  
 logName:  "projects/my-great-project/logs/dataflow.googleapis.com%2Fworker"  
 receiveTimestamp:  "2019-07-23T15:41:24.085630441Z"  
 resource: {
  labels: {
   job_id:  "2019-07-23_08_37_19-14491780021717194051"    
   job_name:  "template-27-5"    
   project_id:  "my-great-project"    
   region:  "us-central1"    
   step_id:  ""    
  }
  type:  "dataflow_step"   
 }
 severity:  "ERROR"  
 timestamp:  "2019-07-23T15:41:06.450673103Z"  
}
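The part of this entry that stands out is the empty step_id label. My assumption is that the Dataflow UI uses this label to attach worker logs to a step, so entries with an empty value would not appear under any step. A minimal sketch for checking exported entries, using a trimmed copy of the entry above (`has_step_id` is a hypothetical helper name):

```python
def has_step_id(entry):
    """Return True if the log entry carries a non-empty resource.labels.step_id."""
    labels = entry.get("resource", {}).get("labels", {})
    return bool(labels.get("step_id"))


# Trimmed copy of the entry shown above; only the fields the check needs.
entry = {
    "resource": {
        "type": "dataflow_step",
        "labels": {
            "job_id": "2019-07-23_08_37_19-14491780021717194051",
            "job_name": "template-27-5",
            "step_id": "",
        },
    },
    "severity": "ERROR",
}

print(has_step_id(entry))
```

For the entry above the check is false, which is consistent with the log being visible in Stackdriver but not under the step in the Dataflow UI.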

The same issue is also reported at https://issues.apache.org/jira/browse/BEAM-7934. I will close this one so the issue is tracked in one place.