


 REASONS

   Reasons for putting a table on a tree, rather than a tree on a table

   Trees and tables are the two standard ways of arranging information,
   maybe even of thinking...

   So,  if we need to extract a hierarchy from a bunch of wine glasses on
   a table, or a collection of leaves, what do we do?

   Hierarchise, based on characteristics, or flatten, based on generic
   type.

   There's  obviously  no  definitive  reason  to  choose  one "principal
   arrangement of data" over the other, in any general way. Certain types
   of data and certain uses of that data lean one way or the other.

   Nevertheless, computationally, systems still lean towards the table...

   For instance, XML repositories are typically backed by an extensive
   collection of indexes: all nodes in a DOM view are "flattened" for
   rapid searching without having to examine the whole tree. The same
   goes for attributes, document types et al. In fact, from one
   perspective, a DOM tree is a hierarchical arrangement of seven tabular
   systems: Document Type > DTD, Element > Text Content, Comment > Text
   Content, Attribute > Text Content and so on. The "model" type
   repositories (as opposed to "native-XML" repositories) do precisely
   this: when asked for a document, they reassemble it from its component
   parts, losing non-XML decoration (whitespace). In situations where the
   data is far more important than the original format of the document,
   this works perfectly well. The big benefit is that there is one data
   store (the XML nodes and their hierarchy) rather than two (XML nodes
   and XML documents), the latter of course needing constant,
   resource-consuming work to keep synchronised.
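
   As a rough illustration of the "model" approach described above, the
   sketch below flattens a small document into two tables (elements and
   attributes) and then reassembles it. The function names, the row
   layout and the sample document are assumptions made for this example,
   not the design of any particular repository. It handles only
   elements, attributes and leading text, and it drops whitespace
   decoration just as the text describes.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Flatten an XML document into two flat tables (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    elements = []    # rows: (node_id, parent_id, tag, text)
    attributes = []  # rows: (node_id, name, value)

    def visit(node, parent_id):
        node_id = len(elements)
        # Whitespace-only text is decoration and is not stored.
        text = (node.text or "").strip()
        elements.append((node_id, parent_id, node.tag, text))
        for name, value in node.attrib.items():
            attributes.append((node_id, name, value))
        for child in node:
            visit(child, node_id)

    visit(root, None)
    return elements, attributes


def reassemble(elements, attributes):
    """Rebuild the document from the flat tables.

    Relies on parents appearing before their children in `elements`,
    which is the order flatten() produces.
    """
    nodes = {}
    root = None
    for node_id, parent_id, tag, text in elements:
        attrib = {name: value for i, name, value in attributes if i == node_id}
        if parent_id is None:
            node = ET.Element(tag, attrib)
            root = node
        else:
            node = ET.SubElement(nodes[parent_id], tag, attrib)
        node.text = text or None
        nodes[node_id] = node
    return ET.tostring(root, encoding="unicode")


doc = """<cellar>
  <wine year="1998">Rioja</wine>
  <wine year="2001">Chablis</wine>
</cellar>"""

elements, attributes = flatten(doc)

# A "search" is now a scan of a flat table, not a walk of the tree.
wines = [text for _id, _parent, tag, text in elements if tag == "wine"]

rebuilt = reassemble(elements, attributes)
```

   Here the rebuilt document carries the same data as the original but
   none of its indentation, which is the trade-off the paragraph above
   accepts.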

   Conversely, relational databases do not generally have a multi-tiered
   hierarchical structure attached to them. Certain hierarchies may be
   implied by the use of those tables, and by foreign key constraints,
   but these will generally be only one or two layers deep.
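
   To make "a hierarchy implied by a foreign key" concrete, here is a
   minimal sketch using Python's sqlite3 module. The table name
   (category), its columns and the sample rows are hypothetical. The
   table itself stays flat; the tree exists only through the
   self-referencing parent_id column, and a recursive query recovers it
   on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE category (
        id        INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES category(id)  -- the implied hierarchy
    )
""")
conn.executemany(
    "INSERT INTO category (id, name, parent_id) VALUES (?, ?, ?)",
    [
        (1, "glassware", None),
        (2, "wine glasses", 1),
        (3, "flutes", 2),
        (4, "tumblers", 1),
    ],
)


def path_to_root(conn, node_id):
    """Follow parent_id links upwards and return the names, root first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            SELECT id, name, parent_id, 0 FROM category WHERE id = ?
            UNION ALL
            SELECT c.id, c.name, c.parent_id, chain.depth + 1
            FROM category AS c JOIN chain ON c.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (node_id,),
    ).fetchall()
    return [name for (name,) in rows]


# Flat view: every row, no hierarchy involved.
flat_names = [name for (name,) in conn.execute("SELECT name FROM category ORDER BY name")]

# Hierarchical view: rebuilt from the flat rows when asked for.
flute_path = path_to_root(conn, 3)
```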

   A few illustrative, if off-topic, examples:

   the success of guerilla warfare vs multi-tiered formal armies

   the fact that worldwide there are fewer public service pay grades than
   there used to be


   [icebergsTreesTables.gif]

   All  things  considered,  is  it  easier  for  computers to maintain a
   hierarchy  supported  by  a flat substructure, or constantly flatten a
   hierarchical structure?

   The answer, I think, is borne out by all practice: the former.
   Information management systems that require any form of random access
   to their content (search engines, filesystems, relational databases,
   XML repositories) nearly always supply some form of flattened index
   to that content. Search engines take this to the extreme: they don't
   contain the content themselves, simply a set of (complicated) indexes
   with links to the appropriate document on the web.

   A canonical example is trying to search for recently modified files
   on a hard drive. Given a fast, but large, hard drive, this can take
   about 5 minutes, because every directory has to be visited. Given a
   hierarchical structure on top of a flat system [a relational
   filesystem], it should take under 10 milliseconds. Unix filesystems
   support the separation of file content and its meta-information
   (attributes/properties) up to a point: each file or directory has an
   inode, identified by a unique number, which holds its metadata apart
   from its content. Ideally this concept should be extended to cover
   all common properties of files, and all common properties of a given
   type of file.
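
   The difference between the two searches can be sketched as follows.
   This is only an approximation of the idea: a real relational
   filesystem would keep the flat table current on every write, whereas
   here it is filled by a single walk of a temporary directory. The
   table name (file_meta), its columns and the sample files are
   assumptions made for the example.

```python
import os
import sqlite3
import tempfile


def build_index(root):
    """Walk the tree once and copy each file's metadata into a flat table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_meta (path TEXT PRIMARY KEY, size INTEGER, mtime REAL)"
    )
    # The index on mtime is what lets a "recently modified" query avoid
    # looking at every row.
    conn.execute("CREATE INDEX idx_file_meta_mtime ON file_meta (mtime)")
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            conn.execute(
                "INSERT INTO file_meta (path, size, mtime) VALUES (?, ?, ?)",
                (path, info.st_size, info.st_mtime),
            )
    return conn


def modified_since(conn, timestamp):
    """Return paths modified at or after `timestamp`, newest first."""
    rows = conn.execute(
        "SELECT path FROM file_meta WHERE mtime >= ? ORDER BY mtime DESC",
        (timestamp,),
    )
    return [path for (path,) in rows]


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    old_file = os.path.join(root, "a", "old.txt")
    new_file = os.path.join(root, "a", "b", "new.txt")
    for path, mtime in ((old_file, 1_000_000), (new_file, 2_000_000)):
        with open(path, "w") as handle:
            handle.write("x")
        os.utime(path, (mtime, mtime))  # set access and modification times

    index = build_index(root)
    recent = [os.path.basename(p) for p in modified_since(index, 1_500_000)]
```

   The query in modified_since never touches the directory structure;
   where a file sits in the hierarchy is just another column.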

   What would you lose?

   Absolutely nothing. The question is one of care at design time: to
   ensure, firstly, that frequently accessed meta-information (file
   information, links) is easily accessible and flat, and secondly that
   information requiring large amounts of space (e.g. large images) is,
   if possible, stored in a way that allows the most rapid access in a
   standard use scenario.

   The second point is what hierarchical systems try to do: second-guess
   the end user and say "if you want to access several files at the same
   time, they'll probably all be in the same directory so everything will
   be wonderful". But patterns of use vary greatly from situation to
   situation. For instance, if you defragment your hard drive when it
   has multi-tracked music stored on it, your audio software may no
   longer work, because it expects the data to be fragmented in a
   particular, use-specific form.

 THE TREE ON THE TABLE
   The cherry on the tree on the table

   None of the above is to say that hierarchies aren't absolutely
   necessary. Being able to have different (hierarchical) views of the
   same set of knowledge (say, words in a dictionary arranged by length,
   alphabetical order, emotive content or any other hierarchy that the
   metadata associated with each word can define) is essential for human
   knowledge and should be for computers as well. To create such views
   with the envisaged Chakriya relational filesystem will be so easy it
   will probably become a disease.
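
   The idea of many hierarchical views over one flat set can be sketched
   in a few lines. This is not the Chakriya filesystem, only an
   illustration: the word list, the "emotion" labels and the view()
   helper are all invented for the example. Each view is built on
   demand from the same rows, one level of hierarchy per key function.

```python
from collections import defaultdict

# The flat "table": one row per word, with its metadata (hypothetical).
WORDS = [
    {"word": "joy", "emotion": "positive"},
    {"word": "grief", "emotion": "negative"},
    {"word": "table", "emotion": "neutral"},
    {"word": "tree", "emotion": "neutral"},
    {"word": "glee", "emotion": "positive"},
]


def view(rows, *keys):
    """Arrange flat rows into nested dicts, one level per key function."""
    if not keys:
        # Leaf level: just the words, alphabetically.
        return sorted(row["word"] for row in rows)
    first, rest = keys[0], keys[1:]
    groups = defaultdict(list)
    for row in rows:
        groups[first(row)].append(row)
    return {label: view(group, *rest) for label, group in sorted(groups.items())}


# Three different hierarchies over the same rows.
by_length = view(WORDS, lambda row: len(row["word"]))
by_letter = view(WORDS, lambda row: row["word"][0])
by_emotion_then_length = view(
    WORDS,
    lambda row: row["emotion"],
    lambda row: len(row["word"]),
)
```

   Adding a fourth view means adding one more key function; the rows
   themselves are never rearranged.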

   [image  to  follow of three different hierarchical views of dictionary
   entries and their use scenarios]

   
