[yak@collab] Re: Design Tool, Version 0.2

To: yak@xxxxxxxxxxxxxxxxxxx
From: John Sechrest <sechrest@xxxxxxxxxxxx>
Date: Wed, 11 Jan 2006 08:37:01 -0800
Message-id: <E1Ewiy5-0000T1-36@jas.peak.org>


"M.Altheim" <M.Altheim@open.ac.uk> writes:    (01)

 % > I am not using tuples that way.    (02)

 % I was using "node" in the typical graph theory way. Nodes
 % and arcs, etc.  In that sense, a node is always a leaf,
 % with relations expressed by the arcs. Some people call
 % arcs "edges", etc. but the essentials are the same.    (03)

 Yes. I see.     (04)

 I am using node to name the "fundamental object".
 And I am thinking that attributes of nodes are what is
 graphed. I.e., a node might contain a reference to other nodes,
 and it is this relationship that would be graphed. Since
 nodes may have different attributes that can all be graphed,
 you can switch from view to view, showing different graphs of
 different relationships among nodes.     (05)

 % I'm not sure where the use of "node" comes from here, but
 % conflating "node" and "tuple" is confusing to me, since
 % the former is classically a leaf in a graph, the latter
 % is a finite sequence of objects considered as a set. The
 % thing you've described above to me would be called a
 % "triple", which is similar to (an abstract superset of)
 % the core structure found in RDF.    (06)

 Yes, I am using triples as a focus for the process.
 Each "node" is then a collection of triples.     (07)

 And putting the triples into databases directly is like
 indexing every attribute of the node. So we have some
 storage/representation of a node itself. And we have a perhaps
 different storage mechanism for the triples. Perhaps
 some things don't need to be in databases. Like images. 
 And so nodes that contain images may instead contain
 indexes of references to images, instead of images directly.     (08)

 And so... While I might have a backup file which represents a node
 on the file system, my representation of the node in the database
 is just serialized triples (with perhaps a few more tags).     (09)

 So... the storage mechanism is focused around triples and 
 indexes.     (010)

 The key to my understanding for this was that there is a mapping
 from database tables of the normal type into a serialized table.     (011)

 And so as an example:    (012)

 People often think of nodes as objects and then they 
 create tables to represent them in SQL ala:    (013)

create table node (
 nodeid int primary key not null auto_increment,
 title varchar(60),
 author varchar(60),
 nodeimage varchar(255),
 image2 varchar(255),
 body text,
 gravity int,
 issue varchar(20),
 keywords varchar(255),
 subsection varchar(60)
);    (014)

And then to get things from this, I use things like:    (015)

 select @fields from $TABLE
 where $field[1]=$condition1 and $field[2]=$condition2
 order by $sortfield limit $row,$size    (016)


However, this runs us into all kinds of problems with coding practices
and with flexibility and scalability over time as the types of
attributes change and shift. We would like to have arbitrary relationships.    (017)

(Note: I see RDF and XML as syntaxes to transfer and communicate nodes, but I
don't see them as a storage system.)    (018)

However, if instead, I represent the table as:    (019)


 create table triple (
  fieldid int primary key not null auto_increment,
  nodeid int,
  `key` varchar(60),
  value longtext
  );    (020)

Then I am able to get access to the same data, using a transformed query:    (021)

Instead of saying:     (022)

SELECT artid,title,author FROM node;    (023)

I can say:     (024)


SELECT T1.value, T2.value, T3.value
FROM triple AS T1, triple AS T2, triple AS T3
WHERE T1.nodeid = T2.nodeid AND T2.nodeid = T3.nodeid AND
      T1.key = 'artid' AND T2.key = 'title' AND T3.key = 'author';    (025)
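To check that the self-join really does rebuild rows from triples, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for the MySQL-style tables above (an assumption made only so the example runs anywhere), and the sample rows are taken from the example table in this message.

```python
import sqlite3

# In-memory SQLite database standing in for the real triple store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triple ("
    " fieldid INTEGER PRIMARY KEY AUTOINCREMENT,"
    ' nodeid INTEGER, "key" TEXT, value TEXT)'
)

# A few triples for two nodes.
rows = [
    (1, "artid", "1"), (1, "title", "test me"), (1, "author", "me"),
    (2, "artid", "2"), (2, "title", "Stuff"), (2, "author", "you"),
]
conn.executemany(
    'INSERT INTO triple (nodeid, "key", value) VALUES (?, ?, ?)', rows
)

# One alias of the triple table per attribute we want back as a column.
query = """
SELECT T1.value, T2.value, T3.value
FROM triple AS T1, triple AS T2, triple AS T3
WHERE T1.nodeid = T2.nodeid AND T2.nodeid = T3.nodeid
  AND T1."key" = 'artid' AND T2."key" = 'title' AND T3."key" = 'author'
ORDER BY T1.nodeid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each alias contributes one column, so a view with N attributes needs an N-way join. A node that lacks any one of the requested attributes drops out of the result entirely.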


And so when I create a view of my nodes as a table, I might
have seen:    (026)


artid   title   author  aimage  body    gravity issue   keyword  ...
1       test me me      x.jpg   stuff   1       Mar04   test1    ...
2       Stuff   you     y.jpg   zip     2       Apr04   test1    ...
3       Things  fred    z.jpg   zap     30      Jun04   test1    ...
4       Dance   sue     102.jpg stuff   3       Mar04   test1    ...    (027)

But the triple storage method would look like:    (028)

Triple:
fieldid nodeid  key     value
1       1       artid   1
2       1       title   test me
3       1       author  me
4       1       aimage  x.jpg
5       1       body    stuff
6       1       gravity 1
7       1       issue   Mar04
8       1       keyword  test1
9       2       artid   2
10      2       title   Stuff
...    (029)



In this way, all possible attributes are storable and searchable without
any code changes. We can add attributes as needed.    (030)
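As a rough illustration of "add attributes as needed", the sketch below turns one wide row into triples and then adds an attribute that the original node table never had. It assumes SQLite via Python's sqlite3 module; the helper row_to_triples and the "reviewer" attribute are hypothetical names invented for this example.

```python
import sqlite3

def row_to_triples(nodeid, row):
    """Turn one wide-table row (a dict) into (nodeid, key, value) triples."""
    return [(nodeid, key, str(value)) for key, value in row.items()]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triple ("
    " fieldid INTEGER PRIMARY KEY AUTOINCREMENT,"
    ' nodeid INTEGER, "key" TEXT, value TEXT)'
)
insert = 'INSERT INTO triple (nodeid, "key", value) VALUES (?, ?, ?)'

# Store an existing row as triples.
conn.executemany(
    insert, row_to_triples(1, {"artid": 1, "title": "test me", "author": "me"})
)

# A brand-new attribute is just one more row; no ALTER TABLE is needed.
# "reviewer" is a made-up attribute for illustration.
conn.execute(insert, (1, "reviewer", "sue"))

# The new attribute is searchable right away.
found = conn.execute(
    'SELECT nodeid FROM triple WHERE "key" = ? AND value = ?',
    ("reviewer", "sue"),
).fetchall()
count = conn.execute("SELECT COUNT(*) FROM triple").fetchone()[0]
print(found, count)
```

The trade-off is that every value is stored as text, so typing and validation move out of the schema and into whatever code reads the triples.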

If we add a network access mechanism on top of this, then we
get the foundation to drive this to the level where it can be
useful for the process that Eric is pointing at.     (031)

It will need to have an identification/authorization process
for it to work as it needs to, but ignoring that for the moment,    (032)


Suppose that we use http: as our transport mechanism.
If I can say http://site.com/triple/$attribute/$value
and have that return a list of data, then I think we have the
foundation in place.    (033)

For example:    (034)

http://site.com/triple/nodeid/10
would return all of the triples for node number 10.    (035)

Nodeid: 10
Author: joe bob
Date: 2006/1/9
body: This is a line of test    (036)


<Key: Node ids need to have the global attribute that purple numbers need.
      They must be globally unique, in order for things to mesh in a
      shared environment>    (037)

This comes close to the suggestion of using SQL to query things directly,
in the sense that we are forcing a query into a narrow structure.    (038)

SELECT * FROM triple where nodeid=10;    (039)

is the equivalent query.    (040)
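Here is a minimal sketch of how a /triple/$attribute/$value URL might be mapped onto the triple table, again using SQLite through Python's sqlite3 module. The function name lookup is hypothetical, and so is the behaviour for attributes other than nodeid (returning every triple of each matching node); the message only spells out the nodeid case.

```python
import sqlite3
from urllib.parse import urlparse, unquote

def lookup(conn, url):
    """Map /triple/<attribute>/<value> onto a query over the triple table."""
    parts = [unquote(p) for p in urlparse(url).path.strip("/").split("/")]
    if len(parts) != 3 or parts[0] != "triple":
        raise ValueError("expected /triple/<attribute>/<value>")
    _, attribute, value = parts
    if attribute == "nodeid":
        # Same as: SELECT * FROM triple WHERE nodeid=10
        sql = ('SELECT nodeid, "key", value FROM triple '
               "WHERE nodeid = ? ORDER BY fieldid")
        return conn.execute(sql, (int(value),)).fetchall()
    # Assumption: for any other attribute, return all triples of every
    # node that carries a matching key/value pair.
    sql = ('SELECT nodeid, "key", value FROM triple WHERE nodeid IN '
           '(SELECT nodeid FROM triple WHERE "key" = ? AND value = ?) '
           "ORDER BY fieldid")
    return conn.execute(sql, (attribute, value)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triple ("
    " fieldid INTEGER PRIMARY KEY AUTOINCREMENT,"
    ' nodeid INTEGER, "key" TEXT, value TEXT)'
)
conn.executemany(
    'INSERT INTO triple (nodeid, "key", value) VALUES (?, ?, ?)',
    [
        (10, "author", "joe bob"),
        (10, "date", "2006/1/9"),
        (10, "body", "This is a line of test"),
        (11, "author", "sue"),
    ],
)

by_id = lookup(conn, "http://site.com/triple/nodeid/10")
by_author = lookup(conn, "http://site.com/triple/author/joe%20bob")
for nodeid, key, value in by_id:
    print(f"{key}: {value}")
```

Values travel as bound parameters rather than being pasted into the SQL text, which matters once the URL is reachable from the network.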


I will have to think harder on how to represent templated queries which
generate subsets of node relations as tables.    (041)

Like:     (042)

How do I represent this:     (043)

select author, date, title from node where author='joe bob' order by date
limit 10,20;    (044)
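One possible answer, offered as a sketch rather than a settled design: generate the self-join mechanically from the field list, the conditions, the sort field and the limit. The function build_query and its parameters are hypothetical names. SQLite via Python's sqlite3 module is assumed, with MySQL's "limit offset,count" written as LIMIT/OFFSET, and the sample dates are in ISO form so that text ordering matches date ordering.

```python
import sqlite3

def build_query(fields, where=None, order_by=None, limit=None):
    """Build a triple self-join; returns (sql, params).

    limit is an (offset, count) pair, mirroring MySQL's "limit offset,count".
    """
    where = where or {}
    # Every attribute mentioned anywhere needs its own alias of the table.
    names = list(fields)
    for name in list(where) + ([order_by] if order_by else []):
        if name not in names:
            names.append(name)
    alias = {name: f"T{i}" for i, name in enumerate(names, start=1)}

    select = ", ".join(f"{alias[f]}.value" for f in fields)
    tables = ", ".join(f"triple AS {alias[n]}" for n in names)
    conds, params = [], []
    for n in names:                      # pin each alias to one key
        conds.append(f'{alias[n]}."key" = ?')
        params.append(n)
    for n in names[1:]:                  # all aliases describe the same node
        conds.append(f"{alias[n]}.nodeid = T1.nodeid")
    for n, v in where.items():           # the original WHERE conditions
        conds.append(f"{alias[n]}.value = ?")
        params.append(v)

    sql = f"SELECT {select} FROM {tables} WHERE " + " AND ".join(conds)
    if order_by:
        sql += f" ORDER BY {alias[order_by]}.value"
    if limit:
        sql += " LIMIT ? OFFSET ?"
        params += [limit[1], limit[0]]
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triple ("
    " fieldid INTEGER PRIMARY KEY AUTOINCREMENT,"
    ' nodeid INTEGER, "key" TEXT, value TEXT)'
)
sample = {
    1: ("joe bob", "2006-01-11", "C"),
    2: ("sue", "2006-01-09", "X"),
    3: ("joe bob", "2006-01-09", "A"),
    4: ("joe bob", "2006-01-10", "B"),
}
for nodeid, (author, date, title) in sample.items():
    conn.executemany(
        'INSERT INTO triple (nodeid, "key", value) VALUES (?, ?, ?)',
        [(nodeid, "author", author), (nodeid, "date", date),
         (nodeid, "title", title)],
    )

sql, params = build_query(
    ["author", "date", "title"],
    where={"author": "joe bob"}, order_by="date", limit=(1, 2),
)
result = conn.execute(sql, params).fetchall()
print(result)
```

Because values are stored as text, ORDER BY sorts them as strings. Numeric or date ordering would need a cast or a typed value column, which is one of the things still to be thought through.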


I am still thinking out loud here.     (045)

And so I am at some level assuming a database storage system, with the ability
to look at that database from the outside as xml, or as text or in other more
complicated ways.     (046)

And so a larger document may end up being many multiple nodes. And it will
be the query that we do that generates that document and the view of that
document.     (047)


(I am sorry for rambling like this, but I needed to get it down in one place)    (048)


...
 % Yes, this is true using either your language or mine. XNode is
 % not used in any way, shape or form as a display mechanism. It
 % is, like NODAL or ReiserFS, just a storage format that permits
 % metadata to be stored along with the object. SOAP is similar
 % in this regard.    (049)

 % But in looking at what you're talking about, yes, we seem
 % to be using "node" in different ways. I would have to say
 % that I think XNode and NODAL do not have a fundamental
 % disagreement in terminology, as I was in conversation with
 % Lee during the early days of XNode and so far as I know I'm
 % just using typical graph theory terminology.    (050)

 % It sounds like (to me) you are using "node" as a term for
 % tuples, or collections of tuples (which would also be tuples).    (051)

 Yes, I think that collections of tuples(triples) may be correct.     (052)

 It occurs to me that the "body" of a node could be thought of
 as a paragraph or a letter or a chapter, in normal document processing
 work. And yet, the choice of which one you choose greatly alters how
 things work.    (053)

 In terms of this IBIS dialog process, I think that a "node"
 needs to represent the core component. The Question, the goal, the 
 alternative. And this may end up with many attributes, but all the things
 about question #134 need to be under the one node in some way.     (054)





 % > I am thinking more low level and more simply. But I do have
 % > an interest in making this low level system be able to be
 % > distributed. And so I am still thinking about the API at the network
 % > level for how nodes and tuples talk to each other in a distributed
 % > space.     (055)

 % Well, XNode is operating at the storage level, which in my
 % architecture is almost as low as one can get. In my first
 % implementations I was using the Apache Xindice XML database
 % as the backing store, and prior to later versions it only
 % worked in a client-server manner. So, while Ceryle currently
 % uses an embedded database, for its first two years or so it
 % was operating as a client-server system, and I build the
 % XNode API to permit appropriate listeners and messaging so
 % that the client could wake up and connect with the remote
 % database. It all worked fine. I've since been able to dump a
 % lot of that machinery (which was rather complex compared
 % with the needs of an embedded database that is always present),
 % and I'm happy to have lightened the code base accordingly. I
 % will be happy to make available the earlier API versions if
 % people wanted that, though off the top of my head I think
 % it's still backwards compatible at the API level -- only
 % the implementation has been simplified.    (056)

 Help me understand XNODE. What is the thirty second elevator speech
 for XNODE?    (057)






-----
John Sechrest          .         Helping people use
                        .           computers and the Internet
                          .            more effectively
                             .                      
                                 .       Internet: sechrest@peak.org
                                      .   
                                              . http://www.peak.org/~sechrest    (058)

-- 
This message is archived at:    (059)

http://collab.blueoxen.net/forums/cgi-bin/mesg.cgi?a=yak&i=E1Ewiy5-0000T1-36@jas.peak.org    (060)