gee-community/geemap

Histogram data output is empty when using a for loop

Remolkeman opened this issue · 5 comments

Environment Information

OS | Windows | CPU(s) | 16 | Machine | AMD64
Architecture | 64bit | RAM | 31.3 GiB | Environment | Jupyter
Python 3.11.9 (tags/v3.11.9:de54cf5, Apr 2 2024, 10:12:12) [MSC v.1938 64 bit (AMD64)]
geemap | 0.32.0 | ee | 0.1.398 | ipyleaflet | 0.18.2
folium | 0.16.0 | jupyterlab | 4.1.6 | notebook | 7.1.2
ipyevents | 2.0.2 | geopandas | 0.14.3 | localtileserver | 0.10.2

Description

I am trying to extract the ndvi pixel values from the histograms of multiple Sentinel-2 images and output them as a .csv file. Because there are a decent number of images, I created a for loop to iterate per image, providing the ndvi values from the histogram and saving them on csv. The issue is that the for loop outputs empty data(it actually outputs the bins but not the count); however, if it is executed line by line, it consistently outputs data. I used this example as the base of the code: #1032

What I Did

I have commented on the lines related to saving the data so that it is more comfortable to be run. I also wrote the prints to check the bins and count.

When running the loop this happends:

output_ndvi

If you run line by line:

line_by_line

I also tried to add a delay between each petition to the GEE because I thought that caused the empty data, but the result was the same.

# %%
import ee
import geemap.chart as chart
import pandas as pd

# %%
ee.Authenticate()
ee.Initialize()


# %%
def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])

    return ndvi.rename("NDVI")


# %%
geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

# %%
collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

# %%
ndviCollection = collection.map(calculateNDVI)

# %%
# geemap.ee_export_image_collection_to_drive(ndviCollection, folder="GEE", scale=10)

# %%
my_sample = ndviCollection.toBands().sample(geometry, 10)

# %%
options = {
    "title": "NDVI",
    "xlabel": "NDVI",
    "ylabel": "n",
    "colors": ["#1d6b99"],
}


# %%
properties = my_sample.first().propertyNames().getInfo()

# %%
for prop in properties:
    hist_data = chart.feature_histogram(my_sample, prop, **options, show=False)
    
    print(hist_data.bins)
    print(hist_data.count)

    df = pd.DataFrame({"NDVI": hist_data.midpoints, "n": hist_data.count[1:]})

    # df.to_csv(f"{prop}.csv")

# %%

Have you tried using the zonal_stats function?
https://geemap.org/notebooks/12_zonal_statistics/

Hi!

Following your suggestion, I tried the zonal_stats function, and the following are the results

No for loop:

import ee
import geemap
import os

ee.Authenticate()
ee.Initialize()

def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])
    return ndvi.rename("NDVI")

geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

ndviCollection = collection.map(calculateNDVI)

geometry_feature = ee.FeatureCollection(geometry)
scale = 10

out_ndvi_stats = os.path.join(os.path.expanduser("~"), "Downloads", "ndvi_stats.csv")
geemap.zonal_stats(ndviCollection, geometry_feature, out_ndvi_stats, stat_type='MEAN', scale=scale)

geemap.create_download_link(out_ndvi_stats)


In this case, it does output a csv with the mean values properly and not empty.

"
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1/projects/earthengine-legacy/tables/34324237bcb4dd798c95c943ae4842e8-3f0985e16b05496e4c0d4373a875a4c0:getFeatures
Please wait ...
Data downloaded to C:\Users\user\Downloads\ndvi_stats.csv
"

However, when using a for loop, it does not work properly.

import ee
import geemap
import os

ee.Authenticate()
ee.Initialize()

def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])
    return ndvi.rename("NDVI")

geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

ndviCollection = collection.map(calculateNDVI)


geometry_feature = ee.FeatureCollection(geometry)


scale = 10

out_dir = os.path.join(os.path.expanduser("~"), "Downloads")

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

for i, image in enumerate(ndviCollection.toList(ndviCollection.size())):
    image = ee.Image(image)
    out_ndvi_stats = os.path.join(out_dir, f"ndvi_stats_{i}.csv")
    geemap.zonal_stats(image, geometry_feature, out_ndvi_stats, stat_type='MEAN', scale=scale)

"
ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))"
}
"
What I am looking for is to extract the ndvi value of each pixel in each image of the collection inside the designed geometry. As I could check, this function returns mean, std, sum, etc. It seems that something is wrong in the for loop, and I might be failing to properly program it.

Use collection.toBands() instead of a for loop.

import ee
import geemap
import os

ee.Authenticate()
ee.Initialize()

def calculateNDVI(image):
    ndvi = image.normalizedDifference(["B8", "B4"])
    return ndvi.rename("NDVI")

geometry = ee.Geometry.Polygon(
    [
        [
            [-6.310153895594499, 36.89932544542902],
            [-6.30912392733278, 36.89637395041979],
            [-6.308093959071061, 36.892461327607315],
            [-6.304574900843522, 36.89294183594877],
            [-6.298480921961686, 36.894108772184815],
            [-6.293674403406999, 36.895618898477814],
            [-6.294618540980241, 36.89726627490264],
            [-6.297536784388444, 36.903855425007976],
            [-6.301399165369889, 36.90275727283513],
            [-6.305519038416764, 36.90207091970124],
            [-6.310411387659928, 36.90124728779126],
        ]
    ]
)

collection = (
    ee.ImageCollection("COPERNICUS/S2_HARMONIZED")
    .filterBounds(geometry)
    .filterDate("2015-01-01", "2022-01-15")
    .filterMetadata("CLOUDY_PIXEL_PERCENTAGE", "less_than", 10)
)

ndviCollection = collection.map(calculateNDVI)


geometry_feature = ee.FeatureCollection(geometry)


scale = 10

out_dir = os.path.join(os.path.expanduser("~"), "Downloads")

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

image = ndviCollection.toBands()

out_ndvi_stats = "ndvi.csv"
geemap.zonal_stats(image, geometry_feature, out_ndvi_stats, stat_type='MEAN', scale=scale)

Hi Qiusheng Wu!

It is very likely that I have not properly explained the problem I am facing. To that end, I want to give you some context to help with your understanding:

I am currently doing a final master thesis, in which I am applying common remote sensing techniques such as NDVI, NDMI, SAVI..... on several delimited areas within the Guadalquivir estuary(Spain). Thus far, I have created a time series that represents the mean value of the indices. What I am currently trying to do is to use each satellite image within these time series and represent the value of each pixel for NDVI, NDMI, etc. in a histogram, and thus check the distribution of the data outside a mean. For example, imagine a Sentinel-2 image for the date 25/12/2015. In this case, I want to know the NDVI value of each pixel in that defined area, represent it in a histogram, and check the distribution of this data for that particular day of that year. In the case of my study, I would have approximately 2000+ images between the areas. Therefore, I want to apply a for loop that uses these NDVI pixel values and represent them in a histogram for each of these images and, if possible, save the NDVI(I write NDVI, but it would be for each index) pixel values in a csv. In short, I am not looking for mean values, but the pixel values.

When I tried to apply this loop, I encountered the problem reported above, returning an empty output for using the loop.