unytics/bigfunctions

[bug]: `explore_column `: can't make visualization work

amirali-shfz opened this issue · 6 comments

Check the bug has not already been reported

Edit function_name and the short error description in title above

  • I wrote the correct function name and a short error description in the title above

What happened and what did you expect?

not sure if I'm calling the function correctly, but none of the explore calls create visualizations. Here's what I see when I run call bigfunctions.us.explore_column("bigfunctions.us.natality.weight_pounds"); select html from bigfunction_result;
image
image

When I put html in jsfiddle, I see this:
image

I asked chatgpt (lol) to fix the value and here's what I got.
old html:


                    <div class=""container"">
  <div class=""box"">
    <p class=""is-size-4 mb-6 mt-4"">Column <code>weight_pounds</code> <code>FLOAT64</code></p>
      <div class=""columns"">
        <div class=""column is-one-quarter"">
          <table class=""table is-narrow"">
            
              <tr><th class=""has-text-weight-bold"">min           </th><td>0.50044933474</td></tr>
              <tr><th class=""has-text-weight-bold"">mean          </th><td>7.312606681578967</td></tr>
              <tr><th class=""has-text-weight-bold"">max           </th><td>18.0007436923</td></tr>
            
            
              <tr><th class=""has-text-weight-bold"">distinct_count</th><td>5,727</td></tr>
              
            
            <tr><th class=""has-text-weight-bold"">missing_count </th><td >19,983</td></tr>
            
              <tr><th class=""has-text-weight-bold"">missing_ratio </th><td class=""has-background-danger-light"">0.2 %</td></tr>
            
            <tr><th class=""has-text-weight-bold"">row_count     </th><td>12,978,619</td></tr>
          </table>
        </div>
        <div class=""column"">
          
  <div><canvas id=""e6537ea7-1f58-4ba7-ad17-769ce9160ab7""></canvas></div>
  <script>
    (function() {
      new Chart(
        document.getElementById('e6537ea7-1f58-4ba7-ad17-769ce9160ab7'),
        {
            type: 'bar',
            data: {
              labels: [""[0.5, 2.25["",""[2.25, 4["",""[4, 5.75["",""[5.75, 7.5["",""[7.5, 9.25["",""[9.25, 11["",""[11, 12.75["",""[12.75, 14.5["",""[14.5, 16.25["",""[16.25, 18]""],
              datasets: [{
                label: 'value_count',
                data: [84161,186850,976204,5885362,5132205,668553,24143,950,126,82],
                borderColor: 'rgb(64, 81, 181)',
                backgroundColor: 'rgb(64, 81, 181)',
              }]
            },
            options: {}
          }
      );
    })();
  </script>

        </div>
      </div>
  </div>
</div>

new html

<div class="container">
  <div class="box">
    <p class="is-size-4 mb-6 mt-4">Column <code>weight_pounds</code> <code>FLOAT64</code></p>
    <div class="columns">
      <div class="column is-one-quarter">
        <table class="table is-narrow">
          <tr><th class="has-text-weight-bold">min           </th><td>0.50044933474</td></tr>
          <tr><th class="has-text-weight-bold">mean          </th><td>7.312606681578967</td></tr>
          <tr><th class="has-text-weight-bold">max           </th><td>18.0007436923</td></tr>
          <tr><th class="has-text-weight-bold">distinct_count</th><td>5,727</td></tr>
          <tr><th class="has-text-weight-bold">missing_count </th><td >19,983</td></tr>
          <tr><th class="has-text-weight-bold">missing_ratio </th><td class="has-background-danger-light">0.2 %</td></tr>
          <tr><th class="has-text-weight-bold">row_count     </th><td>12,978,619</td></tr>
        </table>
      </div>
      <div class="column">
        <div><canvas id="e6537ea7-1f58-4ba7-ad17-769ce9160ab7"></canvas></div>
        <!-- Include Chart.js library -->
        <script src="https://cdn.jsdelivr.net/npm/chart.js"></script>
        <script>
          (function() {
            new Chart(
              document.getElementById('e6537ea7-1f58-4ba7-ad17-769ce9160ab7'),
              {
                type: 'bar',
                data: {
                  labels: ["[0.5, 2.25[","[2.25, 4[","[4, 5.75[","[5.75, 7.5[","[7.5, 9.25[","[9.25, 11[","[11, 12.75[","[12.75, 14.5[","[14.5, 16.25[","[16.25, 18]"],
                  datasets: [{
                    label: 'value_count',
                    data: [84161,186850,976204,5885362,5132205,668553,24143,950,126,82],
                    borderColor: 'rgb(64, 81, 181)',
                    backgroundColor: 'rgb(64, 81, 181)',
                  }]
                },
                options: {}
              }
            );
          })();
        </script>
      </div>
    </div>
  </div>
</div>

Here is what I expected:
image

Hi @amirali-shfz.

I am very pleased you created this issue.

When I started bigfunctions more than a year ago I designed a bookmarlet (kind of chrome extension but easier) to render the html result of a function in BigQuery console such as this screenshot.

But:

  • BigQuery UI changed, then the bookmarklet did not work any more
  • Google released dataplex data profiling and then I was not sure that this function is still needed anymore.
  • More importantly, I came to think that functions should not create html, only data. If we want to render data in a UI we should use another tool.

So I gave up html rendering in BigQuery console.

Until now.

I'd be please to have your feedback:

  1. Is this function really interesting to you? Why not use dataplex data profiling?
  2. Would you use if it were accessible not from bigquery console but from another website such as https://cloud.unytics.io ?
  3. Would you be ready to authenticate with google on that website so that it can use your BigQuery credentials to perform the query for you before displaying the rendered html?
  1. I am new to dealing with data directly, and currently dealing with some very unclean tables. It'd allow me to understand a column quickly without doing multiple queries to check fill-rate, seeing some samples, etc. Honestly I'm likely butchering this function by calling it on a string column... Looking at dataplex, it seems to answer this issue actually nice pointer! I'll look into that.
  2. It would be fine if there was a link that I could click and see the graph elsewhere. Note that this is in work context and I'd have to ensure the security of my actions are aligned with company policy.
  3. Likely not for a work place use-case. I'd much rather get an html I could render myself

(btw awesome project I'll have to say)

Thanks @amirali-shfz for your answers. Very insightful.

A. It seems not possible to render html with charts in BigQuery console anymore
I tried to make my bookmarlet work again, but it seems that Google secured its page and I am not able to load external javascript anymore (I get the error "This document requires 'TrustedScriptURL' assignment."). Thus I am not able to load chart.js for charting any more.

B. Sending to cloud.unyics.io the result of the function
I understand that exposing your BigQuery credentials to cloud.unytics.io so that cloud.unytics.io make the queries for you is not safe enough for a work place. But sending the computed statistics in json to cloud.unytics.io is much preferable as you would have more control on what you send to cloud.unytics.io. Correct?

So one way to handle this would be:

  1. Run call compute_table_statistics('project.dataset.table')
  2. Run call send_bigfuntion_result_to_unytics().
  3. This remote function then stores the json + the fact that it is owned by your email in a unytics database.
  4. You can then go to cloud.unytics.io, sign in with Google Sign In (cloud.unytics.io only sees your email) and then you can see the rendered html. You can also delete the json from here.

What do you think? Would it be ok for you?

Hi @amirali-shfz I have good news!

I spent my evening on it and realized that it is actually still possible to create a bookmarklet.
(Unlike what I thought TrustedScriptURL stuff does not prevent to load external scripts, we just need to declare them safe).

I will modify the documentation of explore_column function to show you how to use it.

(there still are some adjusments to do but I'd be glad to check it works on your side already).

I'll let you know when the documentation is updated

Hi @amirali-shfz. It works!

The documentation is displayed on each function.

For example here for explore_column

Let me know if it works for you and I'll close this issue.

This works for me! This is awesome man! Thanks for fixing this up, super cool project.