[Metafacture] Proposal for Metafacture module package structure

Christoph Böhme christoph at b3e.net
Fri Nov 4 07:12:15 CET 2016

Hi Philipp and all,

On Wed, Nov 2 10:15:24 2016 GMT+0100, Philipp von Böselager wrote:
> I'm not deep enough into the framework to really judge your modules 
> separation. But it looks very good to me.

It's quite helpful to have someone look at the structure who isn't
too deep into the framework. Your views are probably much less
influenced by the technicalities of the framework. After all, the
separation of the modules should be easy to understand without
knowing too much about the framework's internals.

> One thing I've realized is that the rdf, jdom, json and csv packages 
> are quite small by now. We could consider summarizing them as 
> "dataformats" maybe. Or will they grow soon?

I don't think these packages will grow a lot. My idea for putting the
modules in different packages was mainly based on their dependencies.
Json, jdom and csv require third-party libraries. So, in order to keep
the number of dependencies small in applications using Metafacture, I
thought it would be good to put the modules which require external
libraries in independent packages. I'm not sure how important it is
for others to keep the number of deps low.

> Concerning your "json" vs. "jackson" question: It depends on our 
> intention. Do we want to implement modules for other json frameworks
> or not? As perhaps nobody would answer that question with a
> definitive "no", we might go for "jackson".

What I quite like about the generic name is that it abstracts from the
concrete implementation. In my view, from a user's perspective it doesn't
really matter in most situations whether we use jackson, moxy or our own
parser for processing json. By using a generic name we can designate a
standard Metafacture package for working with json. When someone wants a
json parser based on a specific library, she can still create a package
and name it after the library, which really makes sense in that case as
the library is part of the contract that the package offers. Hence, I
opt for a generic name. Are there other opinions?
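
To make the argument a bit more concrete, here is a minimal sketch in
Java. All type names (JsonFieldReader, SimpleJsonFieldReader,
FieldCounter) are made up for illustration and are not taken from the
Metafacture code base. The toy reader only handles flat objects and
stands in for whatever library the generic package would really use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical contract of a generically named "json" package.
// Nothing in it mentions the library that does the parsing.
interface JsonFieldReader {
    List<String> fieldNames(String json);
}

// Hypothetical default implementation. A real one would probably
// delegate to a library; this stand-in just collects quoted keys
// followed by a colon so that the sketch has no dependencies.
final class SimpleJsonFieldReader implements JsonFieldReader {
    private static final Pattern KEY = Pattern.compile("\"([^\"]*)\"\\s*:");

    @Override
    public List<String> fieldNames(String json) {
        List<String> names = new ArrayList<>();
        Matcher matcher = KEY.matcher(json);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }
}

// User code depends only on the generic contract, so the parser
// behind it can be exchanged without touching this class.
final class FieldCounter {
    private final JsonFieldReader reader;

    FieldCounter(JsonFieldReader reader) {
        this.reader = reader;
    }

    int count(String json) {
        return reader.fieldNames(json).size();
    }
}
```

A package named after a library would instead expose that library's
types in its API, which is exactly the case where naming the package
after the library is appropriate.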

> Following the question of how to further separate the core-modules,
> I had a look at two approaches:
> 1. separation by function type
> 2. separation by data type
> (See attached lists.) While 1. obviously results in too many
> subgroups and has a lack of relevance, 2. indicates a possible way
> to go. There are options to summarize entries of this list, like
> * pojos and objects
> * records and triples
> * collections (lists, maps and sets)
> * ...

The separation by function feels similar to the currently used structure
of converters, pipes, sinks and sources only with more details. Markus
and I were never really happy with the functional package scheme as it
felt somehow arbitrary and did not really help to find modules again. By
having more finely grained packages this might become easier but I still
see problems such as what is the difference between a Converter and a

The data type based grouping looks much better, in particular if we
group those data types further as you suggested. That should also
make it easier to find modules as one usually has an idea what type of
data one is working with.

Another thought I had for separating the modules into core and others
was based on their use by other modules. For example, StreamBuffer and
StreamFlattener are used internally by a number of other modules. These
modules are a kind of building block for other modules. So, I think it
might make sense to put them in a core-modules package. This package
could contain the modules other modules depend on, as highlighted in my
original list.
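
As a rough illustration of what I mean by "building block", here is a
small Java sketch. EventBuffer and RequiredFieldFilter are hypothetical
names, and EventBuffer only mirrors the role that a module like
StreamBuffer plays. It is not its actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical building block: stores events and replays them later.
// Modules like this would live in the core-modules package.
final class EventBuffer {
    private final List<String> events = new ArrayList<>();

    void add(String event) {
        events.add(event);
    }

    void replayTo(Consumer<String> receiver) {
        events.forEach(receiver);
    }

    void clear() {
        events.clear();
    }
}

// Hypothetical higher-level module that uses the building block
// internally: it holds back the literals of a record and forwards
// them only if the record contains the required field.
final class RequiredFieldFilter {
    private final EventBuffer buffer = new EventBuffer();
    private final String requiredField;
    private final Consumer<String> receiver;
    private boolean matched;

    RequiredFieldFilter(String requiredField, Consumer<String> receiver) {
        this.requiredField = requiredField;
        this.receiver = receiver;
    }

    void literal(String name, String value) {
        buffer.add(name + "=" + value);
        if (name.equals(requiredField)) {
            matched = true;
        }
    }

    void endRecord() {
        if (matched) {
            buffer.replayTo(receiver);
        }
        // Reset the state for the next record.
        buffer.clear();
        matched = false;
    }
}
```

The filter could live in any data type based package, while the
buffer it depends on would sit in core-modules so that the dependency
always points from the outer packages towards the core.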

> By the way: the separation by data type made me aware of "XML" not 
> being a separate module so far (like json, jdom, csv and rdf). What 
> is the reason for it?

One reason for keeping the xml modules in the core was that they have no
external dependencies. The other was that the XmlDecoder in particular
is a rather central module as all XmlHandlers require this module as well.

