Bug: Race condition on dependent steps leading to data loss
sGy1980de opened this issue · 0 comments
Hey guys,
awesome work on this project and thank you for providing it to the community. I'm pretty sure, I found a race condition in the executor though. I'll open a pull request to fix that issue, but let me explain first:
Given the following schemas (stripped down from our real application) and query:
Schema A
schema {
query: Query
}
interface Node {
id: ID!
}
type Query {
Accounts: AccountList!
}
type AccountList {
list: [Account!]
}
type Account implements Node {
id: ID!
accountId: String!
name: String!
campaigns: CampaignList
}
type CampaignList {
list: [Campaigns!]
}
type Campaigns implements Node {
id: ID!
campaignId: String!
name: String!
}
Schema B
schema {
query: Query
}
interface Node {
id: ID!
}
type Query {
node(id: ID!): Node
}
type Campaigns implements Node {
id: ID!
objective!
}
Query
query {
Accounts {
list {
accountId
name
campaigns {
list {
id
objective
}
}
}
}
}
When using gateway
to federate these two schemas, I ran into some weird situation, where the response was lacking some
results from the service providing Schema B
.
Got result
{
"Accounts": {
"list": [
{
"accountId": "de8e45b1-51f0-42ae-ba68-55fee29182cc",
"campaigns": {
"list": [
{
"id": "QWRzZXQ8MTc2MjQxMTI0Pg==",
"objective":"BRAND_AWARENESS"
},
{
"id": "QWRzZXQ8MTc2MjQwODk0Pg=="
}
]
},
"name": "Account Name #1"
},
{
"accountId": "6f2462ae-983b-4c32-a829-0e3ceb96b340",
"campaigns": {
"list": [
{
"id": "QWRzZXQ8MTY1MjA1NTI2Pg==",
"objective":"WEBSITE_VISIT"
},
{
"id": "QWRzZXQ8MTY1MjA1MDc2Pg=="
},
{
"id": "QWRzZXQ8MTY1MjAyOTY2Pg==",
"objective":"WEBSITE_VISIT"
},
{
"id": "QWRzZXQ8MTY1MjAyNzI2Pg=="
}
]
},
"name": "Account Name #2"
}
]
}
}
Expected result
{
"Accounts": {
"list": [
{
"accountId": "de8e45b1-51f0-42ae-ba68-55fee29182cc",
"campaigns": {
"list": [
{
"id": "QWRzZXQ8MTc2MjQxMTI0Pg==",
"objective":"BRAND_AWARENESS"
},
{
"id": "QWRzZXQ8MTc2MjQwODk0Pg==",
"objective":"BRAND_AWARENESS"
}
]
},
"name": "Account Name #1"
},
{
"accountId": "6f2462ae-983b-4c32-a829-0e3ceb96b340",
"campaigns": {
"list": [
{
"id": "QWRzZXQ8MTY1MjA1NTI2Pg==",
"objective":"WEBSITE_VISIT"
},
{
"id": "QWRzZXQ8MTY1MjA1MDc2Pg==",
"objective":"WEBSITE_VISIT"
},
{
"id": "QWRzZXQ8MTY1MjAyOTY2Pg==",
"objective":"WEBSITE_VISIT"
},
{
"id": "QWRzZXQ8MTY1MjAyNzI2Pg==",
"objective":"WEBSITE_VISIT"
}
]
},
"name": "Account Name #2"
}
]
}
}
In the logs I could not see any issues for the service though. A certain indicator of a race condition happening here
was, that disabling caches for the services kind of mitigates the issue completely, but not reliably.
So I was digging around and adding some own logging to find out what actually the issue is, since arbitrary breakpoints
didn't do the trick here. The log that got me on the right track finally was the following one.
Log Output
INFO[0023] response from dependent response="map[string]interface {}{\"objective\":\"WEBSITE_VISIT\"}" variables="map[id:QWRzZXQ8MTY1MjA1MDc2Pg==]"
INFO[0023] insert item insertion-point="[\"Accounts\",\"list:1\",\"campaigns\",\"list:1#QWRzZXQ8MTY1MjA1MDc2Pg==\"]" payload="{\"objective\":\"WEBSITE_VISIT\"}"
INFO[0023] response from dependent response="map[string]interface {}{\"objective\":\"WEBSITE_VISIT\"}" variables="map[id:QWRzZXQ8MTY1MjAyNzI2Pg==]"
INFO[0023] response from dependent response="map[string]interface {}{\"objective\":\"WEBSITE_VISIT\"}" variables="map[id:QWRzZXQ8MTY1MjAyOTY2Pg==]"
INFO[0023] insert item insertion-point="[\"Accounts\",\"list:1\",\"campaigns\",\"list:3#QWRzZXQ8MTY1MjAyNzI2Pg==\"]" payload="{\"objective\":\"WEBSITE_VISIT\"}"
INFO[0023] response from dependent response="map[string]interface {}{\"objective\":\"BRAND_AWARENESS\"}" variables="map[id:QWRzZXQ8MTc2MjQwODk0Pg==]"
INFO[0023] insert item insertion-point="[\"Accounts\",\"list:0\",\"campaigns\",\"list:1#QWRzZXQ8MTc2MjQwODk0Pg==\"]" payload="{\"objective\":\"BRAND_AWARENESS\"}"
INFO[0023] response from dependent response="map[string]interface {}{\"objective\":\"BRAND_AWARENESS\"}" variables="map[id:QWRzZXQ8MTc2MjQxMTI0Pg==]"
INFO[0023] response from dependent response="map[string]interface {}{\"objective\":\"WEBSITE_VISIT\"}" variables="map[id:QWRzZXQ8MTY1MjA1NTI2Pg==]"
INFO[0023] insert item insertion-point="[]" payload="{
\"Accounts\": {
\"list\": [
{
\"accountId\": \"de8e45b1-51f0-42ae-ba68-55fee29182cc\",
\"campaigns\": {
\"list\": [
{
\"id\": \"QWRzZXQ8MTc2MjQxMTI0Pg==\"
},
{
\"id\": \"QWRzZXQ8MTc2MjQwODk0Pg==\"
}
]
},
\"name\": \"Account Name #1\"
},
{
\"accountId\": \"6f2462ae-983b-4c32-a829-0e3ceb96b340\",
\"campaigns\": {
\"list\": [
{
\"id\": \"QWRzZXQ8MTY1MjA1NTI2Pg==\"
},
{
\"id\": \"QWRzZXQ8MTY1MjA1MDc2Pg==\"
},
{
\"id\": \"QWRzZXQ8MTY1MjAyOTY2Pg==\"
},
{
\"id\": \"QWRzZXQ8MTY1MjAyNzI2Pg==\"
}
]
},
\"name\": \"Account Name #2\"
}
]
}
}"
INFO[0023] insert item insertion-point="[\"Accounts\",\"list:1\",\"campaigns\",\"list:2#QWRzZXQ8MTY1MjAyOTY2Pg==\"]" payload="{\"objective\":\"WEBSITE_VISIT\"}"
INFO[0023] insert item insertion-point="[\"Accounts\",\"list:0\",\"campaigns\",\"list:0#QWRzZXQ8MTc2MjQxMTI0Pg==\"]" payload="{\"objective\":\"BRAND_AWARENESS\"}"
INFO[0023] insert item insertion-point="[\"Accounts\",\"list:1\",\"campaigns\",\"list:0#QWRzZXQ8MTY1MjA1NTI2Pg==\"]" payload="{\"objective\":\"WEBSITE_VISIT\"}"
One can clearly see that the system already inserted items from the dependent service, even before the result of the
main service has been inserted in the final result. This leads to the situation that those items will be discarded in
the final result.