Bunsly/HomeHarvest

Get other attribute information

Closed this issue · 7 comments

Dear ZacharyHampton,

Thank you very much for your work.
If I want to get other attribute information, such as "Construction Materials", "Fireplace Features", "Cooling Features" and "Heating Features", how can I do it? I successfully scraped some data using the "HomeHarvest_Demo" code you provided, but I can't find where to add the attributes above. Since I am a beginner in Python, I am not sure whether I can achieve this by modifying "utils.py" in the "homeharvest" folder.

Best wishes!
Yanruoxi001

If you'd like to add those fields, you would have to find and add the associated fields to the graphql statements we have, then edit the utils.py to add them to the df.

If you want, we can take a look for you and add them.
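
As a rough illustration of the two steps described above (add fields to the GraphQL query, then map them into the df), here is a minimal sketch. The field names `details`, `category` and `text`, and the helper `flatten_details`, are assumptions made for illustration; they are not confirmed against the realtor.com schema or the actual HomeHarvest code.

```python
# Hypothetical selection to append to an existing GraphQL query.
# "details", "category" and "text" are assumed names, not confirmed fields.
EXTRA_FIELDS_FRAGMENT = """
details {
    category
    text
}
"""


def flatten_details(result):
    """Flatten one search result into a single row (a plain dict).

    Each assumed {category, text} entry becomes one column whose name is
    derived from the category and whose value joins the text items.
    """
    row = {"property_id": result.get("property_id")}
    for entry in result.get("details") or []:
        column = entry["category"].strip().lower().replace(" ", "_")
        row[column] = "; ".join(entry.get("text") or [])
    return row


# Made-up response data, shaped the way the fragment above would return it.
sample_result = {
    "property_id": "123",
    "details": [
        {"category": "Heating and Cooling", "text": ["Central Air", "Forced Air"]},
        {"category": "Construction Materials", "text": ["Brick"]},
    ],
}

# A list of dicts like this can be passed directly to pandas.DataFrame(rows).
rows = [flatten_details(sample_result)]
print(rows)
```

The real change would go wherever the library builds its query and its rows; this only shows the shape of the mapping.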

Thank you very much! We hope to explore the relationship between house materials and airtightness and indoor pollution. If possible, we hope to add the following fields: Construction Materials, Roofing, Ceilings, Flooring, Foundation Details, Cooling Features, Heating Features, Fireplace Features, Number of Fireplaces, and Water Source.

@ZacharyHampton what's your strategy for finding the associated fields? I'm a bit new to GraphQL, but I'd be happy to help with this.

I'm also trying to find similar details, as well as many of the contents of the "Property Details" section from realtor.com.

Update: I've found tags as a field which includes a lot of this info, but it seems less detailed than the "Property Details" section of a listing. For instance, there's a swimming pool tag, but the property details may contain more information, such as whether the pool is in-ground.
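
To show what the tags approach gives you, here is a small sketch that turns a listing's tags into yes/no columns. The tag strings (`swimming_pool`, `central_air`, `fireplace`) and the helper `tags_to_flags` are hypothetical; the actual tag values returned by the API may differ.

```python
def tags_to_flags(tags, wanted):
    """Return one boolean per wanted tag, True if the listing carries it."""
    present = set(tags or [])
    return {f"has_{tag}": tag in present for tag in wanted}


# Hypothetical tag values for a single listing.
listing_tags = ["swimming_pool", "fireplace"]
flags = tags_to_flags(listing_tags, ["swimming_pool", "central_air"])
print(flags)
```

This only records whether a tag is present, which is why it cannot capture details such as whether the pool is in-ground.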

Sorry to bother you, but is it possible to add these properties? We cannot get them from the latest version. They include Construction Materials, Roofing, Ceilings, Flooring, Foundation Details, Cooling Features, Heating Features, Fireplace Features, Number of Fireplaces, and Water Source.

How is the schema being fetched? I tried with
SEARCH_GQL_URL = "https://www.realtor.com/api/v1/rdc_search_srp?client_id=rdc-search-new-communities&schema=vesta" and the session token used for queries, but had no luck.
I'm also curious whether the feature info is even available through GQL, because on realtor.com the detail pages seem to be pre-rendered with that info. It isn't part of their GQL queries either.

@liamzhang40 The schema is being fetched through that endpoint with gql introspection.
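
For anyone new to introspection, a minimal sketch of the idea: send the standard `__schema` query as the request body, then search the returned types for field names of interest. The helper names `build_payload` and `find_fields` and the sample response are made up for illustration, and whether the endpoint accepts introspection with a given session or set of headers is not something this sketch verifies.

```python
import json

# Standard GraphQL introspection query listing every type and its fields.
INTROSPECTION_QUERY = """
query {
  __schema {
    types {
      name
      fields {
        name
      }
    }
  }
}
"""


def build_payload():
    """Build the JSON body that would be POSTed to the GraphQL endpoint."""
    return json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")


def find_fields(introspection_result, keyword):
    """Return (type name, field name) pairs whose field name contains keyword."""
    matches = []
    for gql_type in introspection_result["data"]["__schema"]["types"]:
        # Scalars and enums report "fields": null, hence the "or []".
        for field in gql_type.get("fields") or []:
            if keyword.lower() in field["name"].lower():
                matches.append((gql_type["name"], field["name"]))
    return matches


# Made-up response with hypothetical type and field names.
sample_response = {
    "data": {
        "__schema": {
            "types": [
                {"name": "Home", "fields": [{"name": "tags"}, {"name": "list_price"}]},
                {"name": "HomeDetails", "fields": [{"name": "fireplace_features"}]},
                {"name": "String", "fields": None},
            ]
        }
    }
}

print(find_fields(sample_response, "fireplace"))
```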

Open a specific issue if you want a particular attribute.