[Lds] ideas/recommendations for JSON-LD serializations

Adrian Pohl pohl at hbz-nrw.de
Tue Apr 24 16:48:43 CEST 2018


Hello Thomas.

On 4/23/18 3:55 PM, Thomas Gängler wrote:
> Hi Adrian,
> 
> thanks a lot for your proper "pitch" for lobid ;)

:-)

> 
> of course, we are aware of the lobid services, and they are always a 
> huge inspiration for us. However, right now for our use case we are 
> looking for complete dumps of GND etc. - so yes, it's cool that lobid 
> offers some of the proposed features, but since you do not offer dumps 
> of your data, it's not applicable for our use case (where performance 
> matters*).

Actually, you can use the lobid API to download dumps as gzip by sending 
the "Accept-Encoding: gzip" request header and using the format=bulk 
parameter, e.g. for ZDB data:
$ curl --header "Accept-Encoding: gzip" \
  "http://lobid.org/resources/search?q=inCollection.id%3A%22http%3A%2F%2Flobid.org%2Fresources%2FHT014846970%23%21%22&format=bulk" \
  > zdb.gz

See also the API documentation at [6].
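
For those who prefer a script over curl, here is a minimal Python sketch 
of the same request plus a reader for the result. It assumes that the 
bulk response is gzipped JSON Lines (one record per line). The function 
names are made up for illustration and are not part of lobid.

```python
import gzip
import json
import shutil
import urllib.parse
import urllib.request
from typing import IO, Iterator

SEARCH_ENDPOINT = "http://lobid.org/resources/search"
# Same filter as in the curl example: all resources in the ZDB collection.
ZDB_QUERY = 'inCollection.id:"http://lobid.org/resources/HT014846970#!"'


def build_bulk_url(query: str) -> str:
    """URL-encode the query and request the bulk format."""
    params = urllib.parse.urlencode({"q": query, "format": "bulk"})
    return f"{SEARCH_ENDPOINT}?{params}"


def download_bulk(url: str, target_path: str) -> None:
    """Save the still-compressed response body, like `curl ... > zdb.gz`."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        with open(target_path, "wb") as out:
            shutil.copyfileobj(response, out)


def iter_records(gz_stream: IO[bytes]) -> Iterator[dict]:
    """Yield one JSON object per non-empty line of a gzipped stream.

    Assumption: the dump is JSON Lines, so it can be read record by
    record without loading the whole file into memory.
    """
    with gzip.open(gz_stream, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():
                yield json.loads(line)
```

Calling download_bulk(build_bulk_url(ZDB_QUERY), "zdb.gz") and then 
iterating over iter_records(open("zdb.gz", "rb")) would feed the records 
one at a time into a local index.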

And you can also fetch updates for a date range, e.g. by appending " AND 
describedBy.dateModified:[20180403 TO 
20180405]+OR+describedBy.dateCreated:[20180403 TO 20180405]" to the 
value of the q parameter.
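
As a rough sketch, the date clause could be generated like this in 
Python. The field names are the ones given above; the function name is 
hypothetical, and the clause is built exactly as written in this mail, 
without extra grouping parentheses.

```python
from datetime import date


def with_date_range(query: str, start: date, end: date) -> str:
    """Append the modified-or-created date clause to an existing query.

    Dates are formatted as YYYYMMDD, matching the example in the mail.
    The result is not URL-encoded yet.
    """
    span = f"[{start:%Y%m%d} TO {end:%Y%m%d}]"
    return (
        f"{query} AND describedBy.dateModified:{span}"
        f" OR describedBy.dateCreated:{span}"
    )
```

The returned string still has to be URL-encoded before it is sent as 
the value of the q parameter, e.g. with urllib.parse.urlencode.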

Same goes for lobid-organisations but for lobid-gnd we still have to add 
modification dates and bulk request support (see [7] & [8]).

All the best
Adrian

[7] https://github.com/hbz/lobid-gnd/issues/37
[8] https://github.com/hbz/lobid-gnd/issues/91

> 
> Cheers,
> 
> 
> Thomas
> 
> 
> *) I know, this somehow destroys the vision of distributed data 
> available via the web, but in the end performance matters (and then you 
> often need the data locally available, e.g. via a search index)
> 
> 
> On 04/23/2018 01:19 PM, Adrian Pohl wrote:
>> Hello Thomas,
>>
>> I am responding as we also provide ZDB and GND data as well as data 
>> from the German ISIL registry via lobid and already offer most of the 
>> things you are asking for. So, you may want to give it a try.
>>
>> Generally, lobid-gnd is available via https://lobid.org/gnd and is 
>> still in beta, among other reasons because we have not yet implemented 
>> labels for embedded nodes (see [1] and its prerequisite [2]).
>>
>> ZDB data is available as part of lobid-resources at 
>> https://lobid.org/resources. You have to filter by collection to get 
>> all ZDB resources. [3] Note that the RDF representation of ZDB 
>> resources, though very similar, differs from the one provided by the 
>> ZDB itself, see an annotated example at [4].
>>
>> And if you are also interested in ISIL data (the ZDB address 
>> directory, "Adressverzeichnis der ZDB"), then go to 
>> lobid-organisations: https://lobid.org/organisations.
>>
>> On 19.04.2018 14:14, Thomas Gängler wrote:
>>> Hello,
>>>
>>> currently, we process some JSON-LD dumps (e.g. ZDB and GND) from data 
>>> available via DNB. Our observations while processing them are as 
>>> follows:
>>>
>>> 1. It would be nice if you could provide line-delimited JSON [1] 
>>> records (instead of one large JSON object/array, as is the case 
>>> right now)
>>
>> We already provide JSON Lines [5] for lobid-organisations and 
>> lobid-resources, see [6]. We will also add it for lobid-gnd.
>>
>>> 2. It would be nice if the JSON-LD records could be provided in 
>>> compact JSON-LD [2] (instead of the expanded format, as is the 
>>> case right now) + referenced @context* (instead of inline @context)
>>
>> All lobid services provide compacted JSON-LD with a referenced @context.
>>
>>> 3. It would be nice if the (compact) JSON-LD records contained all 
>>> sub-entities (i.e. there are no separate bnodes, but, if necessary, 
>>> hierarchical entities), cf. [4], [5] or similar (instead of separate 
>>> bnode objects on the same hierarchy level, as is the case right now)
>>
>> lobid provides JSON-LD documents with one root node and with all other 
>> nodes embedded in the hierarchy. (This was a main reason and major 
>> improvement in our move from lobid 1.x to the new version.) We don't 
>> add the whole data on an embedded node but only provide a label for 
>> display purposes. (As said above, we are still working on implementing 
>> this in lobid-gnd.) Further data must be fetched from the linked 
>> resource.
>>
>>> We believe that all recommended changes will improve the usability 
>>> of the provided JSON-LD data. Hence, we and probably other 
>>> data consumers of the DNB datasets would be happy if you could 
>>> implement our proposed ideas.
>>
>> I agree.
>>
>> All the best
>> Adrian
>>
>> [1] https://github.com/hbz/lobid-gnd/issues/24
>> [2] https://github.com/hbz/lobid-gnd/issues/85
>> [3] 
>> http://lobid.org/resources/search?q=inCollection.id%3A%22http%3A%2F%2Flobid.org%2Fresources%2FHT014846970%23%21%22&size=10 
>>
>> [4] http://lobid.org/resources/api#periodikum
>> [5] http://jsonlines.org/
>> [6] http://lobid.org/resources/api#content_types
>>
>>>
>>> Best regards,
>>>
>>>
>>> Thomas
>>>
>>>
>>> *) referenced context requires that you probably need to provide/host 
>>> the context documents at DNB (instead of, e.g., ZDB Github account [3])
>>>
>>>
>>> [1] https://en.wikipedia.org/wiki/JSON_streaming#Line_delimited_JSON
>>> [2] https://www.w3.org/TR/json-ld-api/#compaction
>>> [3] https://github.com/Zeitschriftendatenbank/jsonld-context
>>> [4] https://www.w3.org/Submission/CBD/
>>> [5] Minimum Spanning Graph: 
>>> http://onlinelibrary.wiley.com/doi/10.1002/cpe.1623/pdf
>>> _______________________________________________
>>> lds mailing list
>>> lds at lists.dnb.de
>>> http://lists.dnb.de/mailman/listinfo/lds
>>
> 

-- 
Adrian Pohl
hbz - Hochschulbibliothekszentrum des Landes NRW
Jülicher Straße 6
50674 Köln
Phone +49-221-40075-235
http://www.hbz-nrw.de


More information about the lds mailing list