schemaorg/suggestions-questions-brainstorming

Propose pending terms to extend real estate listings: RealEstateListing, subtypes of offer

danbri opened this issue · 23 comments

I hereby propose a type analogous to JobPosting called "RealEstateListing". We do this sometimes (see also FAQPage), by having a type that groups together a common bundle of information. In the real estate case, this will give us a natural place for information about the listing itself rather than about the offer and the property being offered.

In looking at real estate it has also become clear that there is a widespread assumption that most offers are offers whose businessFunction is "Sell". We encourage this view in the specification for https://schema.org/businessFunction ("The business function (e.g. sell, lease, repair, dispose) of the offer or component of a bundle (TypeAndQuantityNode). The default is http://purl.org/goodrelations/v1#Sell."), but it is awkward to depend heavily on defaults in any RDF-like language. I propose we explore the introduction of subtypes of Offer for common business functions, at least SellOffer, BuyOffer and LeaseOffer. This is also much simpler in markup terms.

Hmm :-) As the person who initially proposed this modeling approach in GoodRelations, I would like to raise a few points:

Yes, having types for offers by the type of business function (sell, lease, ...) seems like a great simplification. But it makes processing of bundles with different business functions more difficult. And in general we see a shift away from simple product purchases to complex, specific bundles of rights on bundles of physical products, software, and services. You do not "buy" an iPhone, nor do you buy a song on iTunes in the legal sense of the word.

This could be mitigated in RDF worlds by non-standard axioms in the form of SPARQL CONSTRUCT rules that expand any SellOffer and LeaseOffer into a generic Offer with the proper businessFunction property.
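To make the expansion idea concrete, here is a minimal sketch in plain Python over a JSON-LD-like dict, rather than as an actual SPARQL CONSTRUCT rule. SellOffer and LeaseOffer are the hypothetical types from this proposal, and both the mapping table and the function name `expand_offer` are assumptions made for illustration only.

```python
from typing import Any, Dict

# Hypothetical shortcut types from this proposal, mapped to the
# GoodRelations business function each one would stand for (assumed mapping).
SHORTCUT_TYPES = {
    "SellOffer": "http://purl.org/goodrelations/v1#Sell",
    "LeaseOffer": "http://purl.org/goodrelations/v1#LeaseOut",
}


def expand_offer(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the node with a shortcut type rewritten to a
    generic Offer that carries an explicit businessFunction."""
    expanded = dict(node)
    node_type = node.get("@type")
    if node_type in SHORTCUT_TYPES:
        expanded["@type"] = "Offer"
        # Keep an explicitly stated businessFunction if the author gave one.
        expanded.setdefault("businessFunction", SHORTCUT_TYPES[node_type])
    return expanded
```

A consumer that normalizes every node this way only has to understand the generic Offer plus businessFunction pattern, whichever form the publisher used.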

By the way, note that "buy" as a business function for indicating demand has been superseded in both schema.org and GoodRelations by https://schema.org/Demand.

My main concern, however, is that such a markup-driven direction destroys much of the conceptual clarity of the Agent-Promise-Object-Compensation model in schema.org and GoodRelations, from where it originates.

Quote from the GoodRelations spec:
"The goal of GoodRelations is to define a data structure for e-commerce that is

industry-neutral, i.e. suited for consumer electronics, cars, tickets, real estate, labor, services, or any other type of goods,
valid across the different stages of the value chain, i.e. from raw materials through retail to after-sales services, and
syntax-neutral, i.e. it should work in microdata, RDFa, RDF/XML, Turtle, JSON, OData, GData, or any other popular syntax.
This is achieved by using just four entities for representing e-commerce scenarios:

An agent (e.g. a person or an organization),
An object (e.g. a camcorder, a house, a car,...) or service (e.g. a haircut),
A promise (offer) to transfer some rights (ownership, temporary usage, a certain license, ...) on the object or to provide the service for a certain
compensation (e.g. an amount of money), made by the agent and related to the object or service, and optionally
A location from which this offer is available (e.g. a store, a bus stop, a gas station,...).

This Agent-Promise-Object Principle can be found across most industries and is the foundation of the generic power of GoodRelations. It allows you to use the same vocabulary for offering a camcorder as for a manicure service or for the disposal of used cars."

Most standards for modeling e-commerce information before GoodRelations met only the requirements for a small set of products and services or industries; some for real estate, some for electronics, some for transportation. The use of these four or five essential types for modeling offers made it possible to use the very same vocabulary for almost any industry and business model, from selling camcorders against cold cash to offering peace of mind for good karma.

BTW, making "Sell" a default business function was a concession to an explicit Google request in the initial negotiations of GoodRelations adoption by Google back in 2010.

Addendum: I am not in general against adding a few selected offer subtypes so that the markup for typical sell and lease offers gets simpler in terms of markup, as long as this does not destroy the clean conceptual approach of e-commerce modeling in schema.org, and as long as this is based on a careful review of the implications.

Schema.org will not be forever, but I think it is worth the effort to keep its conceptual model as future-proof as possible by avoiding "markup-driven" quick-fixes; otherwise, the road from a generic vocabulary for the Web for a broad range of consumers to "just" a naming standard for a few major search engines is short. That would be a pity to concede lightheartedly.

One more: Subtypes of Offer based on the object being offered (like RealEstateOffer, CarOffer, BookOffer) are from hell, IMO.

If you want to expand the usage of schema.org for real estate, hotels, or new, used, or rental cars, it would be much, much better for the major search engines to take the readily available elements and examples from schema.org and turn them into official developer recipes:

Hotels: https://schema.org/docs/hotels.html
Real Estate: Same as with hotels, see e.g. https://schema.org/Apartment
New, used, or rental cars: https://schema.org/docs/automotive.html

It's all there. Just find a few stakeholders within each of the major organizations sponsoring schema.org and replicate the content as

https://developers.google.com/search/docs/data-types/Car
https://developers.google.com/search/docs/data-types/HotelRoom
https://developers.google.com/search/docs/data-types/Apartment

or similar, and you'll have a lever from here to the Moon ;-)

Sorry for the rant...

"One more: Subtypes of Offer based on the object being offered (like RealEstateOffer, CarOffer, BookOffer) are from hell, IMO."

Couldn't agree more.

As for the rest: as @mfhepp already expressed, bundles of Offers become quite confusing IMHO once there are multiple ways to express businessFunction.

As a counter suggestion, might it be an idea for Google to start promoting the use of businessFunction in its developer documentation?

If it would help, we could add the BusinessFunction types as schema.org enumerations.

Even though I'm sympathetic to the suggestion (it made sense at first glance), I'm not sure it's worth creating new ways of expressing something if that will lead to confusion for the people who have to write the markup.

Especially if it hasn't even been tested yet if people can be guided to start using businessFunction.

Surely if people can successfully create HowTo and FAQ markup they can add a single 'businessFunction' line of markup, right?

The vast majority of authors use schema.org/Offer for selling, so if businessFunction is not set, was it an error or were they relying on the default? Even if we try to educate people, readers are left trying to figure out what to do with the data.
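The ambiguity described here can be made visible on the consumer side. The following is only a sketch, assuming a consumer that chooses to apply the documented "Sell" default; the function name `resolve_business_function` is hypothetical. It returns a flag alongside the value so that downstream code can tell an explicit statement from an applied default.

```python
from typing import Any, Dict, Tuple

# Default named in the schema.org definition of businessFunction.
GR_SELL = "http://purl.org/goodrelations/v1#Sell"


def resolve_business_function(offer: Dict[str, Any]) -> Tuple[str, bool]:
    """Return (business_function, was_explicit) for an Offer-like dict."""
    value = offer.get("businessFunction")
    if value:
        return value, True
    # Nothing was stated: fall back to the default, but record that we guessed.
    return GR_SELL, False
```

The flag cannot say whether the author omitted the property by mistake or on purpose; it only keeps that uncertainty from being silently lost.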

Well, my personal experience as in-house techhead, external consultant and teacher is that people tend to be completely clueless about 'businessFunction'.

Something which doesn't tend to be the case for people/organizations that are selling something as they don't feel they're missing anything.

Yet those that make offers with a different businessFunction often tend to be clueless they can express this.

So maybe starting with explaining to folks that they can express this would suffice. I'm not per se stating that it will, but I find it too early to start modifying the vocabulary if the effort to explain to people that they can already do something hasn't even been made yet (myself not included).

Just my thoughts...

Initial thoughts.

RE subtyping Offer:
I understand the benefits of making simple things simple, but beginning to subtype Offer based on the specific items being offered is a slippery slope: it's only a matter of time before someone suggests more subtypes based on other attributes, e.g. businessFunction, areaServed, etc. The current model is general and flexible, and not complex if you know the logic behind it. I'd rather provide an official recipe laying out the data model and illustrating common use cases.

RE businessFunction and its default:
Most don't know about businessFunction in Offer, but again an official recipe would solve this. I'm not a fan of default values though. I understand the logic (i.e. user action & error minimization) but it's a double-edged sword.

@danbri can explain his proposal, but I read it as there would be an OfferForLease, OfferForRepair, etc and whether someone is offering real estate, cars, or furniture, they would use the OfferForLease type. I agree an AutoLeaseOffer is starting down a bad path.

First, thanks for the technical discussion. It is good that we are in agreement that a subtype hierarchy by the type of objects being offered is not a good path to take.

As for the business function being a property rather than using subtypes: The underlying rationale is that the business function actually defines the bundle of rights that the offering entity promises to transfer if a business transaction is carried out. This sounds pretty academic, but even "straightforward" transactions like "buy", "lease" etc. can mean very different things in legal terms in different countries, industries, etc.

Now, using a property businessFunction and predefined individuals for the most common bundles of rights (buy, lease, dispose, ...) or rather global approximations thereof preserves the flexibility to define more specific bundles of rights.

The typical use-case is software and music: In there, the bundle of rights specification is the licensing agreement.

In GoodRelations, this is explicit by having an additional type

http://purl.org/goodrelations/v1#License

which is an allowed type for

http://purl.org/goodrelations/v1#hasBusinessFunction

This was also included in the official pull requests for incorporating GoodRelations into schema.org. But back then a few people at the search engine companies feared, if I remember correctly, that owners of Web content might abuse these constructs for putting constraints on search engine usage of Web content and then sue them on the basis of defining and then not obeying a license modeling pattern. (Strangely, we now have https://schema.org/sdLicense without any problems.)

That said: adding two or three subtypes like OfferForPurchase and OfferForLease as shortcuts for common patterns could be a good choice, for it will likely increase the usage of markup in rental offers. But it should be complemented by an explanation that businessFunction is the real thing and can be used for more advanced scenarios.

Silently deprecating businessFunction by pushing these two types will strike back in the long run, for it kills the flexibility of the underlying GoodRelations model.

And I am with @nicolastorzec that adding a second way of modeling has a cost and that simply explaining the proper use of elements that have been available in schema.org for eight years might be the more elegant and effective route. Developers are far from stupid; if Google publishes a 10-line example of using businessFunction for rental offer markup, they will understand....
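For illustration, such a short example might look roughly like the following. This is a sketch, not a published recipe: it builds the markup as a Python dict and serializes it to JSON-LD, and the apartment name and price are invented placeholder values.

```python
import json

# A generic Offer whose businessFunction says the item is leased out,
# not sold. All concrete values below are made-up placeholders.
rental_offer = {
    "@context": "https://schema.org",
    "@type": "Offer",
    "businessFunction": "http://purl.org/goodrelations/v1#LeaseOut",
    "price": "1500.00",
    "priceCurrency": "USD",
    "itemOffered": {
        "@type": "Apartment",
        "name": "Example two-bedroom apartment",
    },
}

# The resulting string could be embedded in a page inside a
# <script type="application/ld+json"> element.
print(json.dumps(rental_offer, indent=2))
```

The only difference from ordinary sales markup is the single businessFunction line.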

Thanks everyone for all the discussion. Let's start by setting aside the things that we're not doing here, and look for common ground. Firstly, there's no proposal to create subtypes of offer for every pair of business function and itemOffered type; type-specific offer subtypes weren't proposed at all. The suggestion was rather to "pave the cowpaths" and give per-businessFunction named types for a rather small number of common cases, and still grounded in the same GR-based underlying conceptual model. Secondly, this wouldn't be "silently deprecating businessFunction". We could even write OWL (for those who use it) that explained the mapping. Indeed as @mfhepp says

This could be mitigated in RDF worlds by non-standard axioms in the form of SPARQL CONSTRUCT rules that expand any SellOffer and LeaseOffer into a generic Offer with the proper businessFunction property.

(BTW I apologize for the scrappy nature of the initial proposal, I had in fact written a more extensive note on the issues but somehow managed to delete it from my phone before posting.)

One key point is that "defaulting" is something that is properly accounted for neither in Schema.org's data model nor in Microdata or any of the related W3C RDF specifications (RDFS, JSON-LD, RDFa etc.). I suspect that we all share the underlying goal of reducing clutter, technical debt and other barriers to data usage, and are differing only on priorities. My proposal was intended as one that prioritizes publisher experience over data consumer experience over our experience as spec writers. The principles articulated in https://www.w3.org/TR/html-design-principles/#priority-of-constituencies are very much in the spirit of Schema.org. I think we can improve the markup experience and the data consumer experience without compromising the underlying simplicity of our core data model and modeling idioms.

In this project we have been very successful at getting large amounts of schema.org data into the Web, but it would be good to lower the barriers to adoption for consuming applications. Avoiding dependence on an undocumented and slightly ambiguous defaulting mechanism is part of that. I was prioritizing that over the cost of having (yet) another way of saying something that can already be expressed.

Without defaulting to sell, an Offer without a businessFunction is a bit like using schema.org/Action without distinguishing e.g. a BuyAction from SellAction, except that the name of "Offer" gently hints at the default, which is that the business function is to sell. Currently the only hint to data consumers of schema.org data that entities typed with Offer might have a "default" businessFunction is the natural language wording in the businessFunction definition. We should probably also update the definition of Offer to mention the defaulting directly, since businessFunction is listed as being applicable also to Demand and to TypeAndQuantityNode. I don't know if every Demand or TypeAndQuantityNode that lacks a businessFunction should have a "sell" default added, but there are other issues with the logic of defaults that make me uncomfortable depending too heavily on such mechanisms. BTW the symmetry between Offer and Demand makes me unsure whether the default businessFunction on Demand would be "sell" or "buy"; @mfhepp can you suggest a clarification here?

The other point was around having a named type for RealEstateListing. The idea was to have a type that captures the overall intent of the graph data without expecting everyone to be running complex graph pattern matches against each and every page; again in the spirit of lowering burden on data consumers. I suggest we park such a design in the Pending area but take care not to move it into the Core without doing a review of its strengths and weaknesses, including documentation that ties it in with Offer. Martin - you would be very welcome to participate in that if you have time. There is also a potential link here to the notion of graph shapes as expressed in validation languages (shacl/shex) that might be worth investigating.

Would it make more sense to just add a https://schema.org/potentialAction to a
https://schema.org/Accommodation?

These already exist and RentAction even has a realEstateAgent property.
https://schema.org/SellAction
https://schema.org/RentAction

I do know from experience this is very awkward though.

I don't think Actions work terribly well for this use case as the listing may be published by a third-party site. For example, Craigslist isn't really a party in a RentAction for the apartments listed there.

Implemented in Release 4.0

Re-opening as there are more design-issues to talk through...

Hello - for RealEstateListing, I wanted to share that a home can be for sale but also have a different status; for example, a listing that is for sale can also be Pending or Contingent.

There are other things that could be added that may or may not be already covered

  • last sold
  • year built
  • year renovated

I've drafted 3 types for review, published in our Pending area:

@danbri hello, do you know how the pending area works?
What does it take for these types to be approved, so that we can start using them and Google understands them?

thanks

I work for a commercial real estate company. Having some option to show that it's a real estate listing for commercial (business-zoned areas of towns and cities) would be great, because these buildings are not inhabitable by law, but are still for sale.

See issue #7 for the context of the move from the main Schema.org issue tracker to this repository.

Can this take into account specific housing types?

The reason I'm asking is because I'm currently building an app for my company that's powered by RETS. Based on the RETS API naming scheme, the creators took a lot of inspiration (I assume) from the Schema.org types, SingleFamilyResidence being one of them. Here's a sample response from their demo data API:

{
  "address": {
    "city": "The Woodlands",
    "country": "United States",
    "crossStreet": "Derry/Glen Erin",
    "full": "86242 South OCOTILLO CT Boulevard #2380",
    "postalCode": "77004",
    "state": "Texas",
    "streetName": "South OCOTILLO CT Boulevard",
    "streetNumber": 86242,
    "streetNumberText": "86242",
    "unit": "2380"
  },
  "agent": {
    "address": null,
    "contact": null,
    "firstName": "Jolie",
    "id": "jrivera",
    "lastName": "Rivera",
    "officeMlsId": null
  },
  "agreement": "Variable Rate",
  "association": {
    "amenities": "Club House,Community Pool,Garden/ Greenbelt/ Trails,Playground,Recreation Room,Sauna/ Spa/ Hot Tub",
    "fee": 1000,
    "frequency": null,
    "name": "SimplyRETS Home Owners Association"
  },
  "coAgent": {
    "address": null,
    "contact": null,
    "firstName": null,
    "id": "SR1234",
    "lastName": null,
    "officeMlsId": null
  },
  "disclaimer": "This information is believed to be accurate, but without warranty.",
  "geo": {
    "county": "West",
    "directions": "From 290 exit Barker Cypress to left on Tuckerton, right on Danbury Bridge, right on Bending Post, right on Driftwood Prairie",
    "lat": 29.889519,
    "lng": -95.48945,
    "marketArea": "Tanglewood Area"
  },
  "internetAddressDisplay": null,
  "internetEntireListingDisplay": null,
  "leaseTerm": null,
  "leaseType": "Modified Gross",
  "listDate": "1990-11-14T00:39:54.081857Z",
  "listPrice": 10815936,
  "listingId": "76723631",
  "mls": {
    "area": "Tanglewood Area",
    "areaMinor": null,
    "daysOnMarket": 521,
    "originalEntryTimestamp": null,
    "originatingSystemName": null,
    "status": "Pending",
    "statusText": ""
  },
  "mlsId": 1005174,
  "modified": "2010-08-19T08:36:36.158419Z",
  "office": {
    "brokerid": null,
    "contact": null,
    "name": null,
    "servingName": null
  },
  "originalListPrice": null,
  "ownership": null,
  "photos": [
    "https://s3-us-west-2.amazonaws.com/cdn.simplyrets.com/properties/trial/home4.jpg",
    "https://s3-us-west-2.amazonaws.com/cdn.simplyrets.com/properties/trial/home-inside-4.jpg"
  ],
  "privateRemarks": "This property is a trial property to test the SimplyRETS. Private agent remarks will be included in this field for use in the SimplyRETS REST API. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
  "property": {
    "accessibility": "Automatic Gate",
    "acres": null,
    "additionalRooms": "Bonus Room,Family Room,Inlaw / Rental Apartment,Inside Utility",
    "area": 3498,
    "areaSource": "PVA Office",
    "bathrooms": null,
    "bathsFull": 3,
    "bathsHalf": 6,
    "bathsThreeQuarter": null,
    "bedrooms": 6,
    "construction": "In Kitchen,Stackable,Washer Included",
    "cooling": null,
    "exteriorFeatures": "Back Yard Fenced, Patio/Deck, Sprinkler System",
    "fireplaces": 2,
    "flooring": null,
    "foundation": "Pier & Beam, Slab",
    "garageSpaces": 5.86283735365766,
    "heating": "Forced Air,Electric,Natural Gas,Wood,Fireplace",
    "interiorFeatures": "Breakfast Bar, Fire/Smoke Alarm, High Ceiling, Island Kitchen",
    "laundryFeatures": "Gas & Electric Dryer Hookup,Gas Dryer Hookup,Inside,Individual Room",
    "lotDescription": "Backs to Trees/Woods,Corner Lot,Level Lot,Suitable for Horses",
    "lotSize": "75X   100",
    "lotSizeArea": null,
    "lotSizeAreaUnits": null,
    "maintenanceExpense": null,
    "occupantName": null,
    "occupantType": null,
    "ownerName": null,
    "parking": {
      "description": "Parking Lot,2 Assigned,2 Unassigned,1-2 Step Entry",
      "leased": null,
      "spaces": 2
    },
    "pool": "Association",
    "roof": "Aluminum, Composition",
    "stories": 5,
    "style": "Colonial, Split Level",
    "subType": "SingleFamilyResidence",
    "subTypeText": null,
    "subdivision": "Barkley Square",
    "type": "RES",
    "view": "None",
    "water": null,
    "yearBuilt": 1967
  },
  "remarks": "This property is a trial property to test the SimplyRETS. This field will include remarks or descriptions from your RETS feed intended for public view. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
  "sales": {
    "agent": {
      "address": null,
      "contact": null,
      "firstName": "Echo",
      "id": "emayer",
      "lastName": "Mayer",
      "officeMlsId": "KWCO834"
    },
    "closeDate": "2000-10-19T11:49:01.836302Z",
    "closePrice": 9749229,
    "contractDate": null,
    "office": {
      "brokerid": "KWCO834",
      "contact": null,
      "name": "Keller Williams Greater Houston",
      "servingName": "Keller Williams Greater Houston"
    }
  },
  "school": {
    "district": null,
    "elementarySchool": "Greenwood",
    "highSchool": "CLMS",
    "middleSchool": "CIMARRON"
  },
  "showingContactName": null,
  "showingContactPhone": null,
  "showingInstructions": "The showing instructions for this trial property are brought to you by the SimplyRETS team. This field will include any showing remarks for the given listing in your RETS feed. Enjoy!",
  "specialListingConditions": null,
  "tax": {
    "id": "873-000-392-4123",
    "taxAnnualAmount": 6915,
    "taxYear": 1988
  },
  "terms": "Submit,Cash,Owner Will Carry,Trade",
  "virtualTourUrl": null
}

Currently, I use the appropriate hierarchy found under Accommodation when presenting an individual property in its own, standalone webpage. Is RealEstateListing going to be preferred for properties that are meant to be sold? Any insight would be much appreciated. Cheers!
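As an illustration of the approach described here, the sketch below maps a record shaped like the sample response above onto a node from the Accommodation hierarchy. The function names and the allowlist of types are assumptions made for this example; the field names on the input side are taken from the sample, and only a handful of fields are mapped.

```python
from typing import Any, Dict

# Accommodation subtypes we accept verbatim from the feed's "subType" field
# (an assumed allowlist; anything else falls back to Accommodation).
KNOWN_TYPES = {"Apartment", "House", "SingleFamilyResidence"}


def drop_none(node: Dict[str, Any]) -> Dict[str, Any]:
    """Remove keys whose value is None so no empty properties are emitted."""
    return {key: value for key, value in node.items() if value is not None}


def record_to_accommodation(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map a listing record onto a schema.org Accommodation-style node."""
    prop = record.get("property") or {}
    addr = record.get("address") or {}
    geo = record.get("geo") or {}

    sub_type = prop.get("subType")
    node = {
        "@context": "https://schema.org",
        "@type": sub_type if sub_type in KNOWN_TYPES else "Accommodation",
        "numberOfBedrooms": prop.get("bedrooms"),
        "numberOfFullBathrooms": prop.get("bathsFull"),
        "yearBuilt": prop.get("yearBuilt"),
        "address": drop_none({
            "@type": "PostalAddress",
            "streetAddress": addr.get("full"),
            "addressLocality": addr.get("city"),
            "addressRegion": addr.get("state"),
            "postalCode": addr.get("postalCode"),
            "addressCountry": addr.get("country"),
        }),
    }
    # Only emit coordinates when both values are present.
    if geo.get("lat") is not None and geo.get("lng") is not None:
        node["geo"] = {
            "@type": "GeoCoordinates",
            "latitude": geo["lat"],
            "longitude": geo["lng"],
        }
    return drop_none(node)
```

Whether such a node should then be wrapped in a RealEstateListing is exactly the open question raised in this comment; the sketch does not settle it.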

Hello Everyone,
According to the specification published here https://schema.org/RealEstateListing, something is completely missing from my point of view.
The description "A RealEstateListing is a listing that describes one or more real-estate Offers (whose businessFunction is typically to lease out, or to sell). The RealEstateListing type itself represents the overall listing, as manifested in some WebPage." completely misses the fundamental point that a real estate Offer should have at least a price, a currency and a time period, depending on whether it is a lease, a rental or a sale.

I don't understand why this is missing; it makes this entity completely useless. I investigated various websites like Zillow and its competitors, and nobody uses the entity.

This should be the baseline, but many more things are missing, such as the position (latitude and longitude or similar), the amenities, possibly a full address, and at least the listing expiration.
Best regards
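For reference, here is one way the details asked for above might be attached using terms that already exist in schema.org, by hanging an Offer off the listing. This is a sketch of one possible arrangement, not an officially documented pattern, and nothing in this thread says how search engines treat it. The URL, dates, prices and address are invented placeholders.

```python
import json

listing = {
    "@context": "https://schema.org",
    "@type": "RealEstateListing",
    "url": "https://example.com/listings/123",  # placeholder URL
    "datePosted": "2020-01-15",
    "leaseLength": {"@type": "QuantitativeValue", "value": 12, "unitCode": "MON"},
    "offers": {
        "@type": "Offer",
        "businessFunction": "http://purl.org/goodrelations/v1#LeaseOut",
        # Price, currency and billing period (UN/CEFACT code MON = month).
        "priceSpecification": {
            "@type": "UnitPriceSpecification",
            "price": 1500,
            "priceCurrency": "USD",
            "unitCode": "MON",
        },
        # Date after which the offer no longer applies.
        "availabilityEnds": "2020-03-31",
        "itemOffered": {
            "@type": "Apartment",
            "address": {
                "@type": "PostalAddress",
                "addressLocality": "Example City",
                "addressCountry": "US",
            },
            "geo": {"@type": "GeoCoordinates", "latitude": 29.76, "longitude": -95.37},
            "amenityFeature": [
                {"@type": "LocationFeatureSpecification", "name": "Community pool", "value": True}
            ],
        },
    },
}

print(json.dumps(listing, indent=2))
```

In this arrangement the price and timing live on the Offer, while position, address and amenities live on the property being offered, which matches the separation of listing, offer and property described at the top of this issue.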

My company is a provider/developer of real estate websites for a large number of national and multinational real estate brokerages and brokerage franchises, and I found this issue as I'm currently building the ld+json script tag output for the next version of our search pages. A few notes:

DaveyJake mentioned RETS above, but that has been deprecated in favor of the RESO standard (see: https://www.reso.org/rets-extinction-countdown/). I'm not sure how they differ, but any development here that attempts to match non-schemaorg standards should probably use a current standard. Many Multiple Listing Services (MLS) are adopting RESO, and our company as well as others are using it internally to coerce data to be comparable/compatible across MLSes.

The current draft of this proposal only adds two Properties beyond those inherited from other Types: datePosted and leaseLength. These are not really specific to real estate, and not of primary importance to most of our website users.

If a real-estate-specific Type proposal is to be adopted, there are a lot of properties that could be more relevant but aren't included in this draft. You could probably do better by replacing both of the above properties with a single text Property for RESO-schema-based data (see below), similar to the many properties in other types which contain ISO-standard-compliant data. This avoids anyone on the schemaorg team having to develop a strong enough understanding of real estate to get the Properties of this Type correct, while still effectively providing every Property that people in the real estate industry might value.

As things stand I'll almost certainly be using more generic types to describe the real estate data on my company's sites. Of what can be described using those types, I'll be focused on the specific properties that are referenced/recommended/used by major search engines. From my perspective, putting time into developing structured data output on the webpage itself is almost entirely done to boost SEO and enable search engine result set features like alternate-links (e.g. when you search for "amazon towels" then, in addition to linking to amazon's search for towels, the first few results are indented as additional links underneath). We are not intending to make a majority of MLS data easier to scrape from our websites, and in some cases we're legally or contractually prohibited from doing so anyway.

Hope this was helpful.

Please note that there is another open issue on this same topic here:

schemaorg/schemaorg#241

Perhaps we should close one of them and redirect to the other!