damiencarol/jsr203-hadoop

Support for FileChannel

Opened this issue · 5 comments

I'm having trouble using this API with SSHD-Core, mainly because HadoopFileSystemProvider lacks the method public FileChannel newFileChannel(Path path, Set<? extends OpenOption> options, FileAttribute<?>... attrs) throws IOException.

I'm trying to create an implementation, but I'm a bit set back by my lack of knowledge of this API.

Regards,

Olivier.
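For context, the JDK's default FileSystemProvider.newFileChannel throws UnsupportedOperationException, which is what a caller hits when a provider does not override it. Below is a minimal caller-side sketch, not taken from SSHD or jsr203-hadoop, that tries newFileChannel first and falls back to newByteChannel. The class name ChannelOpener is hypothetical, and the fallback only helps callers that can work with a SeekableByteChannel instead of a FileChannel.

```java
import java.io.IOException;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.spi.FileSystemProvider;
import java.util.Set;

/** Hypothetical helper: opens a channel even when the provider has no FileChannel support. */
final class ChannelOpener {
    private ChannelOpener() {}

    static SeekableByteChannel open(Path path, Set<? extends OpenOption> options,
                                    FileAttribute<?>... attrs) throws IOException {
        FileSystemProvider provider = path.getFileSystem().provider();
        try {
            // FileChannel implements SeekableByteChannel, so it can be returned directly.
            return provider.newFileChannel(path, options, attrs);
        } catch (UnsupportedOperationException e) {
            // Default behaviour of providers that do not override newFileChannel.
            return provider.newByteChannel(path, options, attrs);
        }
    }
}
```

With a provider that lacks newFileChannel, the call ends up in newByteChannel, which every FileSystemProvider must implement.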

Could you give me some examples of how you do it?

I'm investigating what you want to do with SSHD-Core.
I will try to find out what SSHD-Core uses in a FileSystemProvider implementation.
WIP

@ogirardot I ran some tests with an embedded FTP server. It seems to work pretty well :)
But I saw that there is a performance bottleneck: the jsr203hadoop implementation makes too many calls to getFileInfo.

I will make some changes for the next version. Anyway, I now have a great tool to test the implementation.

Like this:

FileZilla -> Custom Java Program that uses FtpServer -> Nio -> jsr203hadoop -> HDFS

I don't know how you use NIO with SSHD, but if the MINA FTP server uses NIO in the same way, I will be able to debug it and fix the problem.

2015-12-24 18:18:58 DEBUG IODataConnectionFactory:298 - Opening active data connection
2015-12-24 18:18:58 DEBUG IODataConnectionFactory:314 - Binding active data connection to /127.0.0.1:2424
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #993
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #993
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getFileInfo took 92ms
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #994
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #994
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getFileInfo took 89ms
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #995
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #995
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getFileInfo took 90ms
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #996
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #996
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getListing took 94ms
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #997
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #997
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getFileInfo took 92ms
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #998
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #998
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getFileInfo took 93ms
2015-12-24 18:18:58 DEBUG Client:1024 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol sending #999
2015-12-24 18:18:58 DEBUG Client:1081 - IPC Client (1473611564) connection to ns311426.ip-188-165-198.eu/188.165.198.145:8020 from dcarol got value #999
2015-12-24 18:18:58 DEBUG ProtobufRpcEngine:253 - Call: getFileInfo took 96ms
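
In the log above, each getFileInfo round trip costs roughly 90 ms, so repeated lookups for the same path add up quickly. One possible mitigation, shown here only as a sketch and not necessarily what jsr203hadoop does, is a short-lived cache in front of the lookup. StatusCache, its loader function and the TTL are hypothetical names and choices for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical time-limited cache for expensive per-path lookups. */
final class StatusCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtNanos;
        Entry(V value, long loadedAtNanos) {
            this.value = value;
            this.loadedAtNanos = loadedAtNanos;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final long ttlNanos;

    StatusCache(Function<K, V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlNanos = ttlMillis * 1_000_000L;
    }

    /** Returns the cached value, calling the loader only when the entry is missing or expired. */
    V get(K key) {
        long now = System.nanoTime();
        Entry<V> e = entries.get(key);
        if (e == null || now - e.loadedAtNanos > ttlNanos) {
            e = new Entry<>(loader.apply(key), now);
            entries.put(key, e);
        }
        return e.value;
    }

    /** Must be called after writes, deletes or renames so stale metadata is not served. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

The trade-off is staleness: changes made by other HDFS clients are not visible until the entry expires, so the TTL would need to stay small.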

I'm having trouble using this API with SSHD (Apache MINA SSHD): it throws an EOFException.

I do it like this:

sshd.setFileSystemFactory(new HdfsSftpFileSystemFactory());

public class HdfsSftpFileSystemFactory implements FileSystemFactory {
    @Override
    public FileSystem createFileSystem(Session session) throws IOException {
        return new HadoopFileSystem(new HadoopFileSystemProvider(), "127.0.0.1", 9000);
    }
}

When I download a file, it throws an EOFException.

HadoopFileSystem$3 [position]
java.io.EOFException: Cannot seek after EOF
at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1576)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:65)
at org.javastack.hdfsserver.filesystem.HadoopFileSystem$3.position(HadoopFileSystem.java:534)
at org.javastack.hdfsserver.filesystem.HadoopFileChannel.position(HadoopFileChannel.java:71)
at org.apache.sshd.server.subsystem.sftp.FileHandle.read(FileHandle.java:133)
at org.apache.sshd.server.subsystem.sftp.SftpSubsystem.doRead(SftpSubsystem.java:2087)
at org.apache.sshd.server.subsystem.sftp.SftpSubsystem.doRead(SftpSubsystem.java:2059)
at org.apache.sshd.server.subsystem.sftp.SftpSubsystem.process(SftpSubsystem.java:468)
at org.apache.sshd.server.subsystem.sftp.SftpSubsystem.run(SftpSubsystem.java:418)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
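
The trace shows position(long) calling seek() directly, and the HDFS stream rejecting a position past the end of the file. The SeekableByteChannel contract allows a position greater than the current size and expects a later read to report end-of-file. A minimal sketch of that behaviour follows, assuming the seek can be deferred until a read is possible. SeekableInput, LazySeekReader and ArrayInput are hypothetical names; SeekableInput stands in for a stream such as FSDataInputStream, and none of this is the project's actual code.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a stream whose seek() fails past EOF. */
interface SeekableInput {
    long length() throws IOException;
    void seek(long pos) throws IOException; // throws EOFException when pos > length()
    int read(byte[] buf, int off, int len) throws IOException;
}

/** Records the requested position and seeks only when a read can succeed. */
final class LazySeekReader {
    private final SeekableInput in;
    private long position;

    LazySeekReader(SeekableInput in) { this.in = in; }

    long position() { return position; }

    /** Accepts any non-negative value without touching the underlying stream. */
    LazySeekReader position(long newPosition) {
        if (newPosition < 0) throw new IllegalArgumentException("negative position");
        this.position = newPosition;
        return this;
    }

    /** Returns -1 at or beyond EOF instead of letting seek() throw. */
    int read(ByteBuffer dst) throws IOException {
        if (position >= in.length()) return -1;
        if (!dst.hasRemaining()) return 0;
        byte[] buf = new byte[dst.remaining()];
        in.seek(position);
        int n = in.read(buf, 0, buf.length);
        if (n > 0) {
            dst.put(buf, 0, n);
            position += n;
        }
        return n;
    }
}

/** In-memory SeekableInput used only to exercise the sketch. */
final class ArrayInput implements SeekableInput {
    private final byte[] data;
    private int pos;

    ArrayInput(byte[] data) { this.data = data; }

    public long length() { return data.length; }

    public void seek(long p) throws IOException {
        if (p > data.length) throw new EOFException("Cannot seek after EOF");
        pos = (int) p;
    }

    public int read(byte[] b, int off, int len) {
        if (pos >= data.length) return -1;
        int n = Math.min(len, data.length - pos);
        System.arraycopy(data, pos, b, off, n);
        pos += n;
        return n;
    }
}
```

With this shape, an SFTP read request at an offset beyond the file size yields an end-of-file result rather than an exception from position().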

I think you should not instantiate HadoopFileSystemProvider yourself. The NIO API does it for you.
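
As a rough illustration of that lookup: NIO discovers providers on the classpath and matches them by URI scheme. The sketch below uses only JDK calls; ProviderLookup is a hypothetical helper name, and it assumes the jsr203hadoop jar is on the classpath and registers the hdfs scheme.

```java
import java.net.URI;
import java.nio.file.spi.FileSystemProvider;
import java.util.Optional;

/** Hypothetical helper: finds the installed provider for a URI's scheme. */
final class ProviderLookup {
    private ProviderLookup() {}

    static Optional<FileSystemProvider> forUri(URI uri) {
        String scheme = uri.getScheme();
        // installedProviders() lists the default provider plus those found via service loading.
        return FileSystemProvider.installedProviders().stream()
                .filter(p -> p.getScheme().equalsIgnoreCase(scheme))
                .findFirst();
    }
}
```

In the factory above, something like ProviderLookup.forUri(URI.create("hdfs://127.0.0.1:9000/")) would return the shared provider instance if it is installed. Calling FileSystems.newFileSystem(uri, env) or Paths.get(uri) performs the same scheme match internally.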