[Metafacture] Proposal for Metafacture module package structure
Christoph Böhme
christoph at b3e.net
Mon Oct 31 18:10:03 CET 2016
Hi all,
attached is a proposal for separating the Metafacture modules
which are currently in metafacture-core into different packages.
I am not really happy with the current version of the proposal, as the
metafacture-core-modules package contains way too many modules for my
taste. Perhaps you have some ideas on how we could split up these
modules further. I think it is desirable to have only a small number of
modules in the core-modules package, as a number of other packages use
modules from this package. Hence, it would be good if it were rather
stable, which is much easier if it does not contain many modules.
Another open question is the naming of the Metafacture module packages.
Should they use generic names or implementation-specific ones? For
instance, should the package that contains the modules for processing
JSON with the Jackson library be named metafacture-json-modules or
metafacture-jackson-modules? The former would hide the implementation
detail of which library is used to do the actual reading and writing.
However, it would then no longer be possible to offer two packages for
JSON processing which are based on different implementations. The latter
naming scheme would allow this.
Regarding the general progress of the reorganisation: all pull requests
for metafacture-core are now merged. At the moment I am updating all
dependencies of the core to their latest versions. Afterwards, I want to
determine what needs to go into the Metamorph API package. That is
hopefully not too difficult, as Metamorph is already quite independent
from the rest of Metafacture. Only the XML-based test cases may cause
some trouble. Once this is done, I thought of reorganising the package
structure to reflect the new multi-module structure, so that the actual
split-up in the next release becomes easier. This last all-in-one
version of Metafacture will be released as metafacture-core 4.0.0.
Cheers,
Christoph
-------------- next part --------------
metafacture-framework
Contains the interface definitions and base classes for Metafacture modules.
Stability: high
Packages:
framework
--------------------------------------------------------------------------------
metafacture-morph-api
Contains the interface definitions and base classes for implementing
Metamorph functions and collectors.
Stability: high
Contents:
[TODO] (Collector and Function interfaces, AbstractFlushingCollector ...)
--------------------------------------------------------------------------------
metafacture-commons
Contains utility classes and commonly used data structures.
Stability: medium
Packages:
exceptions
types
util
--------------------------------------------------------------------------------
metafacture-test-tools
An extension for JUnit that allows testing Metamorph scripts using JUnit.
Stability: medium
Packages:
test
Modules:
WellformednessChecker
StreamValidator [DEPENDS ON: metafacture-core-modules (EventList.Event)]
--------------------------------------------------------------------------------
metafacture-core-modules
Stability: medium
Packages:
formeta [DEPENDS ON: metafacture-commons (StringUtil.copyToBuffer)]
Modules:
FormetaEncoder [DEPENDS ON: formeta]
FormetaDecoder [DEPENDS ON: formeta]
FormetaRecordsReader [DEPENDS ON: formeta]
ObjectToLiteral
LiteralToObject
StreamToTriples [DEPENDS ON: formeta]
TriplesToStream [DEPENDS ON: formeta]
StringListMapToStream [DEPENDS ON: metafacture-commons (ListMap)]
MapToStream
CloseSuppressor
IdChangePipe
IdentityStreamPipe
LineReader
RecordReader
RegexDecoder
ObjectTemplate [DEPENDS ON: metafacture-commons (StringUtil.format)]
PojoEncoder
PojoDecoder
PreambleEpilogueAdder
StreamLiteralFormatter
SortedTripleFileFacade
AbstractTripleSort
TripleSort
TripleCount
TripleCollect [DEPENDS ON: formeta]
AbstractStreamBatcher
StreamBatchLogger [DEPENDS ON: metafacture-commons (StringUtil.format)]
StreamBatchResetter
DuplicateObjectFilter
LineSplitter
ObjectBatchLogger [DEPENDS ON: metafacture-commons (StringUtil.format)]
ObjectBuffer
ObjectExceptionCatcher
ObjectLogger
ObjectTee
ObjectTimer [DEPENDS ON: metafacture-commons (TimeUtil.formatDuration)]
RecordToEntity
StreamBatchMerger
StreamBuffer
StreamDeferrer
StreamEventDiscarder
StreamExceptionCatcher
StreamFlattener
StreamLogger
StreamMerger
StreamTee
StreamTimer [DEPENDS ON: metafacture-commons (TimeUtil.formatDuration)]
StringDecoder
StringFilter
StringMatcher
TripleFilter
TripleReorder
EntityPathTracker
EventList
StringConcatenator
NamedValueList [DEPENDS ON: metafacture-commons (Collector, NamedValue)]
NamedValueSet [DEPENDS ON: metafacture-commons (Collector, NamedValue)]
SingleValue [DEPENDS ON: metafacture-commons (Collector)]
StringListMap [DEPENDS ON: metafacture-commons (Collector, ListMap)]
StringMap [DEPENDS ON: metafacture-commons (Collector)]
ValueSet [DEPENDS ON: metafacture-commons (Collector)]
HttpOpener
ResourceOpener [DEPENDS ON: metafacture-commons (ResourceUtil.getReader)]
DirReader
StdInOpener
StringReader
StringSender
XmlDecoder
GenericXmlHandler
CGXmlHandler
SimpleXmlEncoder [DEPENDS ON: metafacture-commons (MultiMap, ResourceUtil.loadProperties)]
XmlTee
XmlElementSplitter [DEPENDS ON: org.apache.commons.lang]
--------------------------------------------------------------------------------
metafacture-extra-modules
Stability: low
Modules:
ObjectPipeDecoupler
StreamUnicodeNormalizer
UnicodeNormalizer
JScriptObjectPipe [DEPENDS ON: metafacture-commons (ResourceUtil.getReader)]
--------------------------------------------------------------------------------
metafacture-morph-modules
Contains Metamorph and modules directly building on it.
Stability: medium
Depends on: metafacture-commons
Modules:
Metamorph [DEPENDS ON: metafacture-core-modules (StreamBuffer, StreamFlattener), org.apache.commons.lang]
Filter [DEPENDS ON: metafacture-core-modules (Metamorph, StreamBuffer, SingleValue)]
Splitter
--------------------------------------------------------------------------------
metafacture-stats-modules
Stability: low
Modules:
AbstractCountProcessor
CooccurrenceMetricCalculator
Counter
UniformSampler
Histogram
--------------------------------------------------------------------------------
metafacture-file-modules
Modules for reading and writing files.
Stability: low
Depends on: org.apache.commons.io
org.apache.commons.compress
Modules:
FileOpener
TarReader
TripleObjectRetriever
TripleObjectWriter
TripleReader
TripleWriter
FileDigestCalculator
ConfigurableObjectWriter
AbstractObjectWriter
ObjectStdoutWriter
ObjectFileWriter
ObjectWriter
XmlFilenameWriter
--------------------------------------------------------------------------------
metafacture-biblio-modules
Modules for working with library related data formats.
Stability: low
Packages:
iso2709 [DEPENDS ON: metafacture-commons (Require, StringUtil.repeatChars)]
Modules:
AseqDecoder
MabDecoder
Marc21Encoder [DEPENDS ON: iso2709]
Marc21Decoder [DEPENDS ON: iso2709]
PicaEncoder
PicaDecoder [DEPENDS ON: metafacture-commons (StringUtil.copyToBuffer)]
MarcXmlHandler
PicaXmlHandler
AlephMabXmlHandler
OreAggregationAdder [DEPENDS ON: metafacture-commons (ListMap, ResourceUtil.loadProperties)]
PicaMultiscriptRemodeler [DEPENDS ON: metafacture-core-modules (StreamBuffer)]
--------------------------------------------------------------------------------
metafacture-rdf-modules
Stability: low
Depends on: org.apache.commons.lang
Modules:
RdfMacroPipe
--------------------------------------------------------------------------------
metafacture-csv-modules
Stability: low
Depends on: net.sf.opencsv
Modules:
CsvDecoder
--------------------------------------------------------------------------------
metafacture-jdom-modules
Stability: low
Depends on: org.jdom
metafacture-commons
Modules:
StreamToJDomDocument
JDomDocumentToStream
--------------------------------------------------------------------------------
metafacture-json-modules
Stability: low
Depends on: org.apache.commons.lang
com.fasterxml.jackson
Modules:
JsonEncoder
JsonToElasticsearchBulk
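--------------------------------------------------------------------------------
The stability concern raised for metafacture-core-modules can be made concrete by tallying how many other packages depend on each package. The following is only a minimal Java sketch: the class name PackageFanIn and the helper fanIn are hypothetical, the edges are transcribed by hand from the DEPENDS ON and "Depends on" annotations in the listing above, and only dependencies between Metafacture packages are counted (external libraries and the implicit dependency on metafacture-framework are left out).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PackageFanIn {

    // Package-level edges, hand-transcribed from the annotations in the listing.
    // Key: depending package; value: Metafacture packages it depends on.
    static final Map<String, List<String>> DEPENDS_ON = new LinkedHashMap<>();
    static {
        DEPENDS_ON.put("metafacture-test-tools",
                Arrays.asList("metafacture-core-modules"));
        DEPENDS_ON.put("metafacture-core-modules",
                Arrays.asList("metafacture-commons"));
        DEPENDS_ON.put("metafacture-extra-modules",
                Arrays.asList("metafacture-commons"));
        DEPENDS_ON.put("metafacture-morph-modules",
                Arrays.asList("metafacture-commons", "metafacture-core-modules"));
        DEPENDS_ON.put("metafacture-biblio-modules",
                Arrays.asList("metafacture-commons", "metafacture-core-modules"));
        DEPENDS_ON.put("metafacture-jdom-modules",
                Arrays.asList("metafacture-commons"));
    }

    // Counts, for every package that is depended upon, how many packages use it.
    static Map<String, Integer> fanIn(Map<String, List<String>> dependsOn) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> targets : dependsOn.values()) {
            for (String target : targets) {
                counts.merge(target, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        fanIn(DEPENDS_ON).forEach((pkg, n) ->
                System.out.println(pkg + " is used by " + n + " package(s)"));
    }
}
```

With the edges as transcribed, metafacture-commons and metafacture-core-modules are the only packages that other Metafacture packages build on, which is why their stability matters most in this proposal.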