HPI-Information-Systems/Metanome

How to make new DBsource available?

Opened this issue · 10 comments

Hi, I am trying to connect to SQL Server and Presto. I did the following changes in my local. I am unable to connect to SQL Server.

  1. I changed enum and included SqlServer and Presto.
  2. Changed backend pom file for getting the drivers from Maven
  3. I had to add these two enum values to new_source.html as they are not included automatically

Is there any other configuration that I need to do for in order to achieve successful connection?
Thanks in advance.

Hi Allam,

it has actually been quite a while since we last tested and worked on the database connectivity. So before you proceed, you should test if the other connections to, for instance, a MySQL database still work. At the moment, we do not have the capacity to actively maintain the project (unfortunately) so some dependencies to still evolving external systems, such as the databases, might not work any more.

If the database connective in general still works, then your described changes should be fine. After all, Metanome simply uses Java's sql DriverManager to connect to a database via a provided URL:

this.dbConnection = DriverManager.getConnection(this.dbUrl, this.userName, this.password);

So maybe the URL is not provided in the expected form?
Also, do you get a specific error when you build Metanome?
Or does an error show up in the Metanome console output or in the Frontend when you make an connection attempt?

Best,
Thorsten

Hi Thorsten, Thanks for the reply. Interestingly no error message is showing up on screen but some error message comes up on the console. I will debug to see where that error message is being thrown. DefaultDatabaseConnection method error messages are not the ones that came up on console. It could be some version conflict. I will try MySQL today to see if my build actually works.

Hi Thorsten, Error on the console is "de.metanome.backend.resources.ParameterResource.retrieveParameters(java.lang.String), is not resolvable to a concrete type."

Mmmmh, that looks like a different issue. The parameter loading is part of the algorithm management and usually not related to the data input sources of the algorithms.
Metanome asks the algorithms for the parameters that they require to be executed. That call seems to fail in your setup. In the past, loading failures have often been caused by incompatible Metanome vs. Algorithms versions. In the most basic case, the algorithms implemented an older version of the Metanome interface. But they might also have build with different Java versions.
To debug this, you should build both Metanome and the algorithm(s) that you want to plug into your setup on the same host. After building Metanome, simply checkout the algorithm repo as well and run the build process also on that repo. Then use these algorithm jars for your testing.
If it still does not work, check when this error occurs. If it has something to do with the database connection, then the error should show up when the database input is registered or a profiling run is actually started.
To rule out the case that your issue is related to some very different problem, first try to run on a simple CSV file. Then, run on MySQL and, finally, test your own connection.

Thanks again. I already checked with CSV and it works. I used the input files that came along with the solution. I downloaded 1.2 release and the corresponding algorithm for testing. I also checked with mysql and the error is same. It is not an error it is warning in console. The full message is "WARNING: Return type, java.util.ArrayList<de.metanome.algorithm_integration.configuration.ConfigurationRequirement>, of method, public java.util.ArrayList> de.metanome.backend.resources.ParameterResource.retrieveParameters(java.lang.String), is not resolvable to a concrete type."
When the connection to MySQL is done, I expected Table Inputs (Choose 1) to be populated but it is not getting populated. I will try building one algorithm but it seems problem could be mainly in the DB connection. The warning message most probably is towards selecting an algorithm if not already selected.
Could you please check 1.2 release if it can connect to jdbc:mysql://localhost:3306/sakila? I installed mysql this morning on windows 10.

Hi Thorsten, If you could make out time for including few more DB connections such as SQLServer, Presto, Teradata, etc.,. and make small improvements for pwd hashing etc., it would help lot many users trying to profile bigdata. Hopefully, you could help harden this framework for big enterprises data governance.

"When the connection to MySQL is done, I expected Table Inputs (Choose 1) to be populated but it is not getting populated."
That is a good idea, but not something Metanome actually automatically does. It never got out of the research prototype stage and misses a ton of features that would actually make it a comfortable tool to use. This is one of this features. So to make it work, you need to specify the table(s) that you want to profile in the database manually. This is simply because there is no unified way to query the available tables and we did not take the effort to program all the different ways of doing retrieving these tables. I your case, this is good, because you don't need to implement that for SQLServer, Presto, Teradata, etc either. Once you created a connection, add another datasource, select table and the connection that you just created and then enter a table name. Then use that table as an input for some algorithm. This is the happy path that should work for all database systems. If that throws an error, we need to check again. We have database connections and database tables as inputs, because some algorithms can work on database connections directly and find the tables themselves. Other algorithms do not know databases at all, so that Metanome reads them and passes the records to the algorithm.

"If you could make out time for including few more DB connections"
Freeing up time for projects isn't really easier for researcher than for engineers in industry, unfortunately. We need to acquire the necessary funding before we can continue. But I am in contact with Felix and if the interest by funding agencies is high enough, you will see some development in the future ;-)

"So to make it work, you need to specify the table(s) that you want to profile in the database manually". Did you mean to specify table names manually from UI? I did not see an option of how to do it. If you meant a file to be edited, please provide details on which file needs to be edited. I will test against MySQL. Thanks for your continued support on this issue.

Sorry for the late response. You can add a table in the following way:

  1. Start the Metanome tool and open the UI in your webbrowser: http://localhost:8080/#/new
  2. Click the "+" sign in the data source panel.
  3. Click "TABLE INPUT".
  4. Select the database connection that your created earlier.
  5. Click the "Table Name" input field and enter the table name as it is listed in the database.
  6. (Optional) Enter a describing "Comment" for you table.
  7. Klick "Save"
  8. Select the table that you just created as an input for the next execution. You find it in the data source panel when you scroll to the bottom of the list.
    Hope this works for you :-)

Hello, I am trying this method, but still encountered some errors. Could you please help me solve my problem #436 ? Thanks in advance!