Re: [Metafacture] how to access a specific value (index/occurrence)
after applying a filter in an entity
Böhme, Christoph
C.Boehme at dnb.de
Fri May 23 16:54:20 CEST 2014
Hi Thomas,
Sorry for the late reply to your email. I hope my answer is still helpful.
Do I understand correctly that the output that you want to get is
ok2= Roll, Gernot
ok2= Thomas, Eugen
but instead you only get
ok2= Roll, Gernot?
The occurrence filter can be reset whenever the entity changes. I think you could use this to solve your problem:
<combine name="ok" value="${out}" sameEntity="true">
  <data source="feld.ind">
    <equals string="p" />
  </data>
  <data source="feld.nr">
    <equals string="077" />
  </data>
  <data name="out" source="feld.value">
    <occurrence only="2" sameEntity="true" />
  </data>
</combine>
I am not 100% sure that the above code works, since I do not know how you convert the mabxml into Metamorph stream events. I suppose you use the GenericXmlHandler for parsing your mabxml. Have you considered writing a specific handler for mabxml? This would probably make sense, as the stream structure you currently have does not follow the actual structure of the MAB format but instead follows the structure of its XML representation. This makes it quite hard to process.
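To illustrate what a record-oriented structure could look like, here is a minimal Python sketch. It is not Metafacture code and does not use any Metafacture API; the helper names `records` and `occurrence` are made up for this example, and the sample is the excerpt from the original mail with the elided lines left out. It groups the fields of each datensatz under a combined key (nr + ind) and splits the content at the `<tf/>` separators, so that "the second 077p of a record" becomes a simple lookup.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace taken from the mabxml excerpt in the original mail.
MABXML_NS = "{http://www.ddb.de/professionell/mabxml/mabxml-1.xsd}"

# Excerpt from the original mail (the lines elided with "..." are omitted).
SAMPLE = """<datei xmlns="http://www.ddb.de/professionell/mabxml/mabxml-1.xsd">
  <datensatz typ="h" status="n" mabVersion="M2.0">
    <feld nr="001" ind=" ">06978834</feld>
    <feld nr="076" ind="v">5</feld>
    <feld nr="077" ind="p">00872805<tf/>Roll, Gernot</feld>
    <feld nr="077" ind="p">00872284<tf/>Thomas, Eugen</feld>
  </datensatz>
</datei>
"""


def records(xml_text):
    """Yield one dict per datensatz: key = nr + ind, value = list of fields.

    Each field is itself a list of parts, split at the <tf/> separators.
    (Hypothetical helper for illustration only.)
    """
    root = ET.fromstring(xml_text)
    for datensatz in root.iter(MABXML_NS + "datensatz"):
        fields = defaultdict(list)
        for feld in datensatz.iter(MABXML_NS + "feld"):
            key = feld.get("nr") + feld.get("ind")
            # Text before the first <tf/>, then the text following each <tf/>.
            parts = [feld.text or ""] + [tf.tail or "" for tf in feld]
            fields[key].append(parts)
        yield dict(fields)


def occurrence(record, key, n):
    """Return the n-th (1-based) field stored under key, or None if absent."""
    values = record.get(key, [])
    return values[n - 1] if 0 < n <= len(values) else None


record = next(records(SAMPLE))
second = occurrence(record, "077p", 2)
print(second)
```

Because the counting happens per record here, the occurrence lookup starts afresh for every datensatz, which is the behaviour the entity-scoped occurrence filter is meant to provide.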
Best,
Christoph
--
***Lesen. Hören. Wissen. Deutsche Nationalbibliothek***
Christoph Böhme
Deutsche Nationalbibliothek
Informationstechnik
Adickesallee 1
D-60322 Frankfurt am Main
Phone: +49-69-1525-1721
Fax: +49-69-1525-1799
mailto:c.boehme at dnb.de
http://www.dnb.de
> -----Original Message-----
> From: metafacture-bounces at lists.dnb.de
> [mailto:metafacture-bounces at lists.dnb.de] On Behalf Of Thomas Gängler
> Sent: Wednesday, 23 April 2014 10:32
> To: metafacture at lists.dnb.de
> Subject: [Metafacture] how to access a specific value (index/occurrence)
> after applying a filter in an entity
>
> Hi,
>
> we would like to implement the following use case:
>
> 1. we have a kind of mabxml, excerpt:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <datei xmlns="http://www.ddb.de/professionell/mabxml/mabxml-1.xsd"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://files.dnb.de/standards/formate/mabxml-
> 1.xsd">
> <datensatz typ="h" status="n" mabVersion="M2.0">
> <feld nr="001" ind=" ">06978834</feld>
> ...
> <feld nr="076" ind="v">5</feld>
> <feld nr="077" ind="p">00872805<tf/>Roll, Gernot</feld>
> <feld nr="077" ind="p">00872284<tf/>Thomas, Eugen</feld>
> ...
> </datensatz>
> </datei>
>
> 2. with a filter statement we can access values for specific mab keys,
> e.g., 076v (note that a pre-processing step in our workflow is the
> RDFization of the data source, which is why the URIs and rdf:value
> properties appear ;) )
>
> <?xml version="1.1" encoding="UTF-8" standalone="no"?>
> <metamorph xmlns="http://www.culturegraph.org/metamorph"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> entityMarker=""
> version="1"
> xsi:schemaLocation="http://www.culturegraph.org/metamorph
> metamorph.xsd">
> <meta>
> <name>mapping1</name>
> </meta>
> <rules>
> <combine name="ok" value="${out}" sameEntity="true"
> reset="false">
>
> <!-- filter elements -->
> <data
>
> source="http://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#feldhttp://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#nr">
> <equals string="076" />
> </data>
> <data
>
> source="http://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#feldhttp://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#ind">
> <equals string="v" />
> </data>
>
> <!-- value of attribute path that should be
> selected for further
> processing -->
> <data name="out"
>
> source="http://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#feldhttp://www.w3.org/1999/02/22-rdf-syntax-ns#value"
> />
>
> </combine>
> </rules>
> </metamorph>
>
> 3. with a filter statement + occurrence function we can access specific
> values for specific mab keys (as long as the key is not a repeatable
> element), e.g., 077p (note that for this we modified the example and
> deleted the second occurrence of the 077p field)
>
> <?xml version="1.1" encoding="UTF-8" standalone="no"?>
> <metamorph xmlns="http://www.culturegraph.org/metamorph"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> entityMarker=""
> version="1"
> xsi:schemaLocation="http://www.culturegraph.org/metamorph
> metamorph.xsd">
> <meta>
> <name>mapping1</name>
> </meta>
> <rules>
> <combine name="@ok" value="${out}" sameEntity="true"
> reset="false">
> <data
>
> source="http://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#feldhttp://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#ind">
> <equals string="p" />
> </data>
> <data
>
> source="http://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#feldhttp://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#nr">
> <equals string="077" />
> </data>
>
> <data name="out"
>
> source="http://www.ddb.de/professionell/mabxml/mabxml-
> 1.xsd#feldhttp://www.w3.org/1999/02/22-rdf-syntax-ns#value"
> />
>
> </combine>
> <data name="ok2" source="@ok">
> <occurrence only="2" />
> </data>
> </rules>
> </metamorph>
>
> ====================================================
>
> So now we would like to enable this functionality for repeatable
> elements as well (such as the 077p key in the example above). Therefore
> we need to collect all values of an entity that match the filter
> criteria (nr = 077 and ind = p) and apply the occurrence function within
> the boundary of that entity. We experimented with nested combines (and
> with various combinations of the reset, flushWith and sameEntity
> parameters). However, so far without any success ... we always got
> either none or all of the values that match the filter criteria :\
>
> Thanks a lot in advance for all your help.
>
> Cheers,
>
>
> Thomas
> _______________________________________________
> Metafacture mailing list
> Metafacture at lists.dnb.de
> http://lists.dnb.de/mailman/listinfo/metafacture