Azure/azure-functions-java-library

Timeout when I access Azure Data Lake Gen2 in an Azure Function with the ADLS Java SDK v12

gjjtip opened this issue · 16 comments

My function code is as follows:

package org.example;

import com.azure.core.http.rest.PagedIterable;
import com.azure.storage.common.StorageSharedKeyCredential;
import com.azure.storage.file.datalake.DataLakeFileSystemClient;
import com.azure.storage.file.datalake.DataLakeServiceClient;
import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;
import com.azure.storage.file.datalake.models.ListPathsOptions;
import com.azure.storage.file.datalake.models.PathItem;
import com.microsoft.azure.functions.*;
import com.microsoft.azure.functions.annotation.AuthorizationLevel;
import com.microsoft.azure.functions.annotation.FunctionName;
import com.microsoft.azure.functions.annotation.HttpTrigger;

import java.util.Iterator;
import java.util.Optional;

/**
 * Azure Functions with HTTP Trigger.
 */
public class Function {

    public static String accountName = "***";
    public static String accountKey = "***";

    @FunctionName("HttpExample")
    public HttpResponseMessage run(
            @HttpTrigger(name = "req", methods = {HttpMethod.GET, HttpMethod.POST}, authLevel = AuthorizationLevel.ANONYMOUS) HttpRequestMessage<Optional<String>> request,
            final ExecutionContext context) {
        context.getLogger().info("Java HTTP trigger processed a request.+++++");

        DataLakeServiceClient dataLakeServiceClient = GetDataLakeServiceClient(accountName, accountKey);
        DataLakeFileSystemClient dataLakeFileSystemClient = GetFileSystem(dataLakeServiceClient);

        String name = ListFilesInDirectory(dataLakeFileSystemClient, context);

        if (name == null) {
            return request.createResponseBuilder(HttpStatus.BAD_REQUEST).body("Please pass a name on the query string or in the request body").build();
        } else {
            return request.createResponseBuilder(HttpStatus.OK).body("List File Names:, " + name).build();
        }
    }

    public static DataLakeServiceClient GetDataLakeServiceClient(String accountName, String accountKey) {

        StorageSharedKeyCredential sharedKeyCredential =
                new StorageSharedKeyCredential(accountName, accountKey);

        DataLakeServiceClientBuilder builder = new DataLakeServiceClientBuilder();

        builder.credential(sharedKeyCredential);
        builder.endpoint("https://" + accountName + ".dfs.core.windows.net");

        return builder.buildClient();
    }

    public static DataLakeFileSystemClient GetFileSystem(DataLakeServiceClient serviceClient) {

        return serviceClient.getFileSystemClient("test");
    }

    public static String ListFilesInDirectory(DataLakeFileSystemClient fileSystemClient, ExecutionContext context) {

        ListPathsOptions options = new ListPathsOptions();
        options.setPath("");
        fileSystemClient.listPaths().forEach( path -> context.getLogger().info(path.getName()));

        return "ABC";
    }
}

The java sdk link: https://azuresdkdocs.blob.core.windows.net/$web/java/azure-storage-file-datalake/12.0.0-preview.6/index.html

Based on my own troubleshooting, the code always hangs on this line:
fileSystemClient.listPaths().forEach( path -> context.getLogger().info(path.getName()));

There is no response at all. I tested the code above outside Azure Functions (in a plain main method, say) and everything works, so I think the code itself is fine. Has anyone else met the same issue? Are there any restrictions in Azure Functions on accessing ADLS Gen2 with the Java SDK? Please help.

I have the same issue.
I can get the FileSystemClient and DirectoryClient objects, but the existence check times out no matter what value I use for the timeout:

import java.time.Duration;          // for the timeout value
import com.azure.core.util.Context; // for Context.NONE

DataLakeFileSystemClient adlsfsClient = adlsClient.getFileSystemClient(fileSystem);
DataLakeDirectoryClient adlsRootDirClient = adlsfsClient.getDirectoryClient("myrootdir");
Duration timeout = Duration.ofMillis(10000);
if (!adlsRootDirClient.existsWithResponse(timeout, Context.NONE).getValue()) {
    adlsRootDirClient.create();
}

Same as you, running this code from my local PC works fine!

I've ported this GitHub issue to the azure/azure-sdk-for-java repo where the Storage team will be able to follow up, shortly.

Azure/azure-sdk-for-java#13846

Can you please follow the steps in this issue and check whether the problem still exists:
Azure/azure-functions-java-worker#381

Internal Tracking devdivcsef 371565

@gjjtip Please let us know if you still face this issue. Thank you so much!

I can confirm that this issue still exists.
Using version 12.2.0 of the Data Lake dependency, the function freezes while the function app is running if a createDirectory is called.
I also tried the Blob Storage SDK against ADLS and faced the same issue when trying to use new XSSFWorkbook(blobClient.openInputStream()):
the function app just freezes and times out with a 503 Service Unavailable, whereas the same code works fine when run via a local Java main method.
For now, the legacy Azure cloud storage SDK serves as a workaround to read the stream, but it is not a solution, as I need to list the paths in the container/directory.

@s3vhub Did you try adding the application setting FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS with the value True or 1?
Can you please share more information, such as the runtime version and the application name?
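For reference, this application setting can be applied from the Azure CLI. This is a minimal sketch; the app and resource-group names are placeholders to replace with your own:

```shell
# Apply the app setting to a deployed Function App.
# "my-function-app" and "my-rg" are placeholder names.
az functionapp config appsettings set \
  --name my-function-app \
  --resource-group my-rg \
  --settings FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS=1
```

The same setting can also be added under "Configuration" for the Function App in the Azure portal.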

Yes, I tried this today by setting the value to 1, and it threw a lot of errors, as if the jars weren't loading.
The JDK version is JDK 8.
The runtime version is 3 on the function app, and the application name is xxx-parser, but it doesn't seem to be application-name specific.

It also happens with the Azure Blob Storage SDK that I tried.

Please elaborate: what more info will you need?

Thank you @s3vhub, we will need this information; it will be helpful.
Please provide the following:

  • Timestamp:
  • Function App name:
  • Function name(s) (as appropriate):
  • Invocation ID:
  • Region:

For the runtime version, you will see it as 3.x.x in the Overview section of the UI. Is it Linux or Windows? And is the plan Dedicated/Premium or Consumption?
Can you share the errors you got when you set the value to 1?

  1. Since this environment belongs to the customer, I can share the timestamp and region, and the app name partially. Would that be enough?
  2. I also have a question: does the above-mentioned application setting work for the azure-storage-blob SDK as well (which I can use to read from ADLS), or is it specific to the Data Lake Gen2 SDK?

As mentioned in issue Azure/azure-functions-java-worker#381, this setting basically makes sure that there is no conflict between the SDK jars and the Azure Functions jars.

If you can share the region, an invocation ID, and the partial app name, we can check how much information we can gather.

So for the Azure Blob Storage SDK, I have added FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS = True.
I had earlier set the value to 1, which threw the errors; however, trying it with the value True seems to have fixed the issue.
Used: Java 8, Azure Functions version 3, azure-storage-blob 12.6.0, on an App Service plan for the function app.

I have yet to try it for the Data Lake SDK, but I can confirm it works for the Blob Storage SDK to read from a blob in ADLS created like container/dir1/dir2/yyyy/MM/dd/file.xls.
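Since True and 1 behaved differently here, it may help to read the setting back and confirm which value actually landed on the app. A sketch, again with placeholder app and resource-group names:

```shell
# Read back the setting to confirm the deployed value (True vs 1).
az functionapp config appsettings list \
  --name my-function-app \
  --resource-group my-rg \
  --query "[?name=='FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS'].value" \
  --output tsv
```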

Thank you @s3vhub. Could you please retry the same working app with the value 1 again? Both values should work: https://github.com/Azure/azure-functions-java-worker/blob/dev/src/main/java/com/microsoft/azure/functions/worker/Util.java#L5

Please let us know once you confirm the datalake.

Thank you so much for your support!

Sorry for the delay. Will check and update.

I am seeing this issue with Java 11 as well. I have tried setting FUNCTIONS_WORKER_JAVA_LOAD_APP_LIBS to true / 1.
Any idea if there is a fix available for this yet?

OS = Linux
Java runtime = 11.

Maven dependency for azure-storage-file-datalake = 12.20.0

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-storage-file-datalake</artifactId>
    <version>12.20.0</version>
</dependency>