[Lds] ideas/recommendations for JSON-LD serializations
Adrian Pohl
pohl at hbz-nrw.de
Tue Apr 24 16:48:43 CEST 2018
Hello Thomas.
On 4/23/18 3:55 PM, Thomas Gängler wrote:
> Hi Adrian,
>
> thanks a lot for your proper "pitch" for lobid ;)
:-)
>
> of course, we are aware of the lobid services, and they are always a
> huge inspiration for us. However, right now for our use case we are
> looking for complete dumps of GND etc. - so yes, it's cool that lobid
> offers some of the proposed features, but since you do not offer dumps
> of your data, it's not applicable for our use case (where performance
> matters*).
Actually, you can use the lobid API to download dumps as gzip by sending
the "Accept-Encoding: gzip" request header and setting the format=bulk
parameter, e.g. for ZDB data:
$ curl --header "Accept-Encoding: gzip" \
  "http://lobid.org/resources/search?q=inCollection.id%3A%22http%3A%2F%2Flobid.org%2Fresources%2FHT014846970%23%21%22&format=bulk" \
  > zdb.gz
See also the API documentation at [6].
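As a rough sketch of how such a download might be consumed: assuming the
bulk response is JSON Lines (one JSON document per line) and was saved
gzip-compressed as zdb.gz as in the curl call above, the records could be
streamed in Python without unpacking the file first. The function name
iter_records is made up for this illustration and is not part of lobid.

```python
import gzip
import json
from typing import Any, Dict, Iterator


def iter_records(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON document per non-empty line of a gzipped file.

    Assumption: the file holds JSON Lines, i.e. one JSON object per line.
    """
    # "rt" decompresses on the fly and decodes the bytes as UTF-8 text.
    with gzip.open(path, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines instead of failing on them
                yield json.loads(line)
```

A caller would loop over iter_records("zdb.gz") and, for example, feed each
record into a local search index, so that the whole dump never has to be
held in memory at once.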
And you can also fetch updates for a date range, e.g. by appending " AND
describedBy.dateModified:[20180403 TO
20180405]+OR+describedBy.dateCreated:[20180403 TO 20180405]" to the
value of the q parameter.
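To illustrate, here is a minimal sketch of how such a request URL could be
assembled in Python. It only builds the URL and does not contact the API.
The names build_bulk_url, SEARCH_URL and ZDB_COLLECTION are hypothetical
helpers for this example; the field names and the collection ID are the
ones used in the query above.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint and ZDB collection ID as they appear in the curl example.
SEARCH_URL = "http://lobid.org/resources/search"
ZDB_COLLECTION = "http://lobid.org/resources/HT014846970#!"


def build_bulk_url(collection_id: str,
                   start: Optional[str] = None,
                   end: Optional[str] = None) -> str:
    """Build a bulk-download URL, optionally limited to a date range.

    start and end are dates in YYYYMMDD form, e.g. "20180403".
    """
    query = f'inCollection.id:"{collection_id}"'
    if start and end:
        date_range = f"[{start} TO {end}]"
        # Mirrors the clause from the mail literally (no extra parentheses).
        query += (f" AND describedBy.dateModified:{date_range}"
                  f" OR describedBy.dateCreated:{date_range}")
    # urlencode percent-encodes the query and turns spaces into "+".
    return f"{SEARCH_URL}?{urlencode({'q': query, 'format': 'bulk'})}"
```

Calling build_bulk_url(ZDB_COLLECTION) reproduces the URL from the curl
example; passing "20180403" and "20180405" adds the date-range clause.
Whether the OR part needs parentheses depends on how the server parses the
query, so that is worth checking against the API documentation.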
The same goes for lobid-organisations, but for lobid-gnd we still have to
add modification dates and bulk request support (see [7] & [8]).
All the best
Adrian
[7] https://github.com/hbz/lobid-gnd/issues/37
[8] https://github.com/hbz/lobid-gnd/issues/91
>
> Cheers,
>
>
> Thomas
>
>
> *) I know, this somehow destroys the vision of distributed data
> available via the web, but at the end performance matters (and then you
> often need the data locally available (e.g. via a search index))
>
>
> On 04/23/2018 01:19 PM, Adrian Pohl wrote:
>> Hello Thomas,
>>
>> I am responding as we also provide ZDB and GND data as well as data
>> from the German ISIL registry via lobid and already offer most of the
>> things you are asking for. So, you may want to give it a try.
>>
>> Generally, lobid-gnd is available via https://lobid.org/gnd and is
>> still in beta, among other reasons because we haven't yet implemented
>> labels for embedded nodes (see [1] and its prerequisite [2]).
>>
>> ZDB data is available as part of lobid-resources at
>> https://lobid.org/resources. You have to filter by collection to get
>> all ZDB resources. [3] Note that the RDF representation of ZDB
>> resources, though very similar, differs from the one provided by the
>> ZDB itself, see an annotated example at [4].
>>
>> And if you are also interested in ISIL data (Adressverzeichniss der
>> ZDB), then go to lobid-organisations: https://lobid.org/organisations.
>>
>> On 19.04.2018 14:14, Thomas Gängler wrote:
>>> Hello,
>>>
>>> currently, we process some JSON-LD dumps (e.g. ZDB and GND) from data
>>> available via DNB. Our observations while processing them are as follows:
>>>
>>> 1. It would be nice, if you could provide line-delimited JSON [1]
>>> records (instead of one large JSON object/array (as it is the case
>>> right now))
>>
>> We already provide JSON Lines [5] for lobid-organisations and
>> lobid-resources, see [6]. We will also add it for lobid-gnd.
>>
>>> 2. It would be nice, if the JSON-LD records could be provided in
>>> compact JSON-LD [2] (instead of the extended format (as it is the
>>> case right now)) + referenced @context* (instead of inline @context)
>>
>> All lobid services provide compacted JSON-LD with a referenced @context.
>>
>>> 3. It would be nice, if the (compact) JSON-LD records contain all
>>> sub-entities (i.e. there are no separate bnodes, but (if necessary)
>>> hierarchical entities), cf. [4], [5] or similar (instead of separate
>>> bnode objects in the same hierarchy level (as it is the case right now))
>>
>> lobid provides JSON-LD documents with one root node and with all other
>> nodes embedded in the hierarchy. (This was a main reason and major
>> improvement in our move from lobid 1.x to the new version.) We don't
>> include all the data of an embedded node but only provide a label for
>> display purposes. (As said above, we are still working on implementing
>> this in lobid-gnd.) Further data must be fetched from the linked
>> resource.
>>
>>> We believe that all recommended changes will lead to a better
>>> usability of the provided JSON-LD data. Hence, we and probably other
>>> data consumers of the DNB datasets will be happy, if you could
>>> implement our proposed ideas.
>>
>> I agree.
>>
>> All the best
>> Adrian
>>
>> [1] https://github.com/hbz/lobid-gnd/issues/24
>> [2] https://github.com/hbz/lobid-gnd/issues/85
>> [3]
>> http://lobid.org/resources/search?q=inCollection.id%3A%22http%3A%2F%2Flobid.org%2Fresources%2FHT014846970%23%21%22&size=10
>>
>> [4] http://lobid.org/resources/api#periodikum
>> [5] http://jsonlines.org/
>> [6] http://lobid.org/resources/api#content_types
>>
>>>
>>> Best regards,
>>>
>>>
>>> Thomas
>>>
>>>
>>> *) referenced context requires that you probably need to provide/host
>>> the context documents at DNB (instead of, e.g., ZDB Github account [3])
>>>
>>>
>>> [1] https://en.wikipedia.org/wiki/JSON_streaming#Line_delimited_JSON
>>> [2] https://www.w3.org/TR/json-ld-api/#compaction
>>> [3] https://github.com/Zeitschriftendatenbank/jsonld-context
>>> [4] https://www.w3.org/Submission/CBD/
>>> [5] Minimum Spanning Graph:
>>> http://onlinelibrary.wiley.com/doi/10.1002/cpe.1623/pdf
>>> _______________________________________________
>>> lds mailing list
>>> lds at lists.dnb.de
>>> http://lists.dnb.de/mailman/listinfo/lds
>>
>
--
Adrian Pohl
hbz - Hochschulbibliothekszentrum des Landes NRW
Jülicher Straße 6
50674 Köln
Telefon +49-221-40075-235
http://www.hbz-nrw.de