[Federated-fs] Draft of requirements for a federated filesystems
    Black_David at emc.com 
    Thu Mar 22 12:22:09 PDT 2007
    
    
  
Dan,
It's a good start, but (IMHO) needs significant attention ...
I went looking for a crisp statement of the problem that is being
solved here, and it wasn't easy to find.  I think the paragraph
starting at line 69 is trying to do this, but it's not obvious
what's broken, or why existing technology doesn't fix
it.  I think the rationale is roughly that between multiple
administrative domains, and the desire to federate existing
systems without ripping out and replacing the existing NSDBs,
a single NSDB is inadequate, hence the primary goal of a federated
FS is to support one or more federated namespaces, each of which
can devolve namespace resolution to multiple NSDBs (that may be
heterogeneous) for different areas of the namespace.
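
As an illustration only of that rationale, here is a minimal sketch of one
federated namespace that hands different areas to different NSDBs.  Nothing
in it comes from the draft: the class names, paths, and server names are all
made up, and a real NSDB would be a network service rather than a dict.

```python
# Hypothetical sketch: one federated namespace, several NSDBs.
# All names (Nsdb, FederatedNamespace, paths, hosts) are illustrative only.

class Nsdb:
    """Stand-in for one namespace database run by one administrative domain."""
    def __init__(self, name, table):
        self.name = name
        self.table = table  # path inside this NSDB's area -> fileserver location

    def resolve(self, relative_path):
        return self.table.get(relative_path)


class FederatedNamespace:
    """Delegates each area of the namespace to the NSDB responsible for it."""
    def __init__(self):
        self.areas = {}  # namespace prefix -> Nsdb

    def delegate(self, prefix, nsdb):
        self.areas[prefix.rstrip("/")] = nsdb

    def resolve(self, path):
        # Pick the longest delegated prefix that covers the path.
        best = None
        for prefix in self.areas:
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            return None
        nsdb = self.areas[best]
        relative = path[len(best):].lstrip("/")
        return nsdb.name, nsdb.resolve(relative)


eng = Nsdb("nsdb-eng", {"src": "fs1.eng.example:/vol/src"})
fin = Nsdb("nsdb-fin", {"reports": "fs7.fin.example:/export/reports"})

ns = FederatedNamespace()
ns.delegate("/corp/eng", eng)
ns.delegate("/corp/fin", fin)
```

Here ns.resolve("/corp/eng/src") is answered by nsdb-eng and
ns.resolve("/corp/fin/reports") by nsdb-fin, without either NSDB having to
know about the other.
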
The definition of federation needs to be tightened up accordingly
- there's a lot of existing technology that satisfies the current
definition (e.g., automounter), which was probably not intended.
Jumping from the definitions straight into requirements leaves
the reader's head spinning.  An architectural/structural overview
of what a federation is, how the pieces fit together, and how it
functions to provide file access to clients over existing protocols
is needed.  Moving the examples and discussion section to before
the requirements would be a good start on this.
Overall, replication, migration and annotation appear to be
additions to the basic functionality of federation.  There ought
to be some rationale for why they make sense as part of the
basic specification of federation functionality.
--- More detailed comments:
Second sentence of filesystem definition looks like it escaped
from the fileset definition ;-).
Definitions of acronyms (e.g., FSL) should *always* include the
expansion of the acronym (e.g., FileSystem Location - note that
the "S" is potentially ambiguous without this expansion).
Junction key definition needs to be expanded - why is the lookup
being done?
Requirement A2 should not be stated as "oblivious" - the client
is definitely not "oblivious" to what's in the federation.  This
is really a "dynamic discovery" requirement - the client must be
able to discover the composition of the federation on the fly
without a priori knowledge of the structure of the federation.
Requirement A3 needs to be rewritten - this is *not* platform-
oblivious.  I think this is about completeness of specified
protocols/interfaces, namely that the federation functionality
is completely specified by the protocols/interfaces to be
developed as part of this effort, and has no dependence on
other protocols/interfaces beyond the "underlying standard
protocols used by the fileservers (i.e., NFS, CIFS, DNS, etc)."
> 	A4.  All fileservers in the federation MUST operate within
>         the same authentication/authorization domain.
I'm not sure, but I need a crisp definition of "authentication/
authorization domain" to understand what's going on here.  Also,
this text:
	... a shared authentication mechanism.  This mechanism is
      not defined or further described in this document.
may be in conflict with A3, depending on how "authentication/
authorization domain" is defined.
The discussion of junction key in A5 2. is incomplete, as is
the definition of junction key in the glossary.  I think a paragraph
or two is needed after the glossary that explains how a client
deals with a junction in a federation when the NSDBs for the source
and target of the junction are different.  The requirement in
A5 2. is probably correct, but I can't check it due to this
lack of explanation of how a junction key is used.
R1d and R1e send me back to A5. 1., specifically the text
saying that "a FSN MUST express, or can be used to discover ... 
the location of the NSDB ...".  The word "express" is dangerous
here, as it appears to envision a location independent name
being bound to a specific location of a name resolution
service (i.e., the NSDB).  I think it would be a good idea
to pop up a level and talk about FSN to FSL mapping as a
"name resolution service" in the junction discussion
that's already needed.  This should explain how a name resolution
service for a geographically distributed federation is envisioned
to work and place requirements on it.  As a hint to get started,
consider why DNS names do *not* express the location of a
name resolution service.
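
One possible reading of the DNS hint, as a sketch under my own assumptions
(the Fsn fields, both tables, and every name and address below are
hypothetical, not taken from the draft): the FSN carries a name for its
NSDB, and a separate lookup step says where that NSDB can be reached today.

```python
# Hypothetical sketch: an FSN names its NSDB without embedding its location.
from dataclasses import dataclass


@dataclass(frozen=True)
class Fsn:
    nsdb_name: str   # location-independent name of the resolution service
    fileset_id: str  # opaque identifier for the fileset


# Step 1: where a named NSDB can currently be reached (free to change).
nsdb_locations = {"nsdb.eng": ["10.0.0.5", "10.1.0.5"]}

# Step 2: the FSN -> FSL mappings each NSDB holds.
nsdb_contents = {
    "nsdb.eng": {"fileset-42": ["fs1.example:/vol/a", "fs2.example:/vol/a"]},
}


def resolve(fsn):
    """Return (NSDB addresses, FSLs) for an FSN, or None if unknown."""
    addresses = nsdb_locations.get(fsn.nsdb_name)
    if not addresses:
        return None
    fsls = nsdb_contents.get(fsn.nsdb_name, {}).get(fsn.fileset_id)
    if fsls is None:
        return None
    return addresses, fsls


fsn = Fsn("nsdb.eng", "fileset-42")
before = resolve(fsn)

# Moving the NSDB changes only the locator table; the FSN is untouched.
nsdb_locations["nsdb.eng"] = ["192.0.2.10"]
after = resolve(fsn)
```

The FSLs returned before and after the move are the same, and no junction
holding the FSN needs rewriting when the NSDB moves.
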
R2 is written in terms of "directory hierarchy", not "fileset" -
why??  This has implications for fileset behavior/functionality
*as viewed from the federation* that need to be explained.
Absent this explanation, the second paragraph of R2 ("It is
the responsibility ...") may be in conflict with the first.
R2a has the "name resolution service" issue - see above
discussion of A5. 1.  In general R2a-R2c appear to dive into
rather low level details, and will need to be re-evaluated
once the "name resolution service" architecture and
requirements are nailed down.
R3b - when MUST the junction appear in all the replicas? 
The definition of replica is vague about update timing
(e.g., "unreachable" can hide a lot of bad behavior).
R4 has a *very* important implication - in the presence of
junction changes, namespace consistency across client views is
*not* guaranteed *because* a client could be wandering around
a stale area of a namespace courtesy of a junction change above
it.  I understand why this is being done, but this discussion
of possible dynamic staleness needs to be *much* earlier in
the document - somewhere like the overview that describes
what a federation is/does and what it isn't/doesn't.
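
A toy model of that staleness, assuming (my assumption, not the draft's)
that a client resolves a junction once and keeps using the result.  The
classes and fileset names are hypothetical.

```python
# Hypothetical sketch: two clients disagree about the namespace after a
# junction change, because one of them resolved the junction earlier.

class Federation:
    def __init__(self):
        self.junctions = {}  # junction path -> target fileset name

    def set_junction(self, path, target):
        self.junctions[path] = target


class Client:
    def __init__(self, federation):
        self.federation = federation
        self.cache = {}  # junction path -> target seen at first traversal

    def traverse(self, path):
        # Resolve once, then keep using the cached target.
        if path not in self.cache:
            self.cache[path] = self.federation.junctions[path]
        return self.cache[path]


fed = Federation()
fed.set_junction("/proj", "fileset-old")

early = Client(fed)
early.traverse("/proj")  # caches fileset-old

fed.set_junction("/proj", "fileset-new")

late = Client(fed)
views = (early.traverse("/proj"), late.traverse("/proj"))
```

After the change, the early client still sees fileset-old under /proj while
the late client sees fileset-new, which is exactly the inconsistency across
client views that the overview should call out.
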
R5 needs to deal with the client consequence of invalidation
of an FSN that the client is accessing.
R6 needs to deal with the NSDB and client consequences of FSL
invalidation.
R7: "Each fileset MUST NOT appear in more than one namespace."
Why is this a requirement??  Unless I've missed something, this
is very easy to violate.
R8a and R8b talk about filesystems as opposed to filesets.
That does not appear to be consistent with the use of the term
filesets elsewhere, e.g., in the definitions of fileset and
filesystem in the glossary.  What's going on here?
R9 - is it the namespace that needs to be accessible, or files
in that namespace or both?
R9a-d say that all fileservers SHOULD implement CIFS, NFSv4, NFSv3,
and NFSv2.  I predict lively discussion on this one ...
Given the escape language provided in R11 and R13, I suspect they
should be lower-case "should" requirements, not upper-case, as the
"may be possible" language is probably inconsistent with a "SHOULD".
The "MUST"s in R14a-c do not appear to be consistent with the
overall "SHOULD" in R14.  I think a distinction between "the
federation interface specification MUST specify" and
"implementations SHOULD support" is in order.  The whole
document needs to be gone over to be specific about who is
the target of each requirement (federation specification,
federation implementer, or even federation administrator).
Also on R14, annotations are potentially dangerous to
interoperability if a client looks for an annotation and only
traverses the junction if that annotation is present.  There
should be a requirement prohibiting this sort of bad behavior,
particularly on vendor-specific annotations.
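
A sketch of the behavior such a requirement would be asking for, with a
made-up junction representation and a made-up annotation key
("vendorx.tier"): annotations may yield hints, but the traversal result
never depends on them.

```python
# Hypothetical sketch: annotations are hints, never a precondition for
# traversing a junction.

def traverse(junction, understood_keys):
    """Return (target, hints).  The target never depends on annotations."""
    annotations = junction.get("annotations", {})
    # Keep only the annotations this client understands; ignore the rest.
    hints = {k: v for k, v in annotations.items() if k in understood_keys}
    return junction["target"], hints


plain = {"target": "fsn-17"}
annotated = {"target": "fsn-17", "annotations": {"vendorx.tier": "gold"}}

# A client that looks for the annotation still traverses when it is absent.
result_plain = traverse(plain, {"vendorx.tier"})
# A client that does not know the annotation still traverses when it is present.
result_annotated = traverse(annotated, set())
```

Both calls reach the same target, so a client from one vendor and a junction
annotated by another still interoperate.
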
N1 - Define "shadowed" - I can't parse the non-requirement as
currently stated.
N2 - Specify what the "updates and access" are to.  I agree with
the underlying concern that requiring distributed transactions is asking
a lot from implementations (e.g., multi-phase commit).
The examples and discussion should probably be much earlier
in the document - much of the missing explanation of how
junctions work is here.
Thanks,
--David
> -----Original Message-----
> From: federated-fs-bounces at sdsc.edu 
> [mailto:federated-fs-bounces at sdsc.edu] On Behalf Of Daniel Ellard
> Sent: Monday, March 19, 2007 10:34 PM
> To: federated-fs at sdsc.edu
> Subject: Re: [Federated-fs] Draft of requirements for a 
> federated filesystems
> 
> 
> Some mailers apparently mangle the text I sent out, so I'm re-sending
> as an attachment.  Hopefully between the two formats, everyone will be
> able to read it properly.
> 
> In the worst case, save the attachment, rename it to something that
> ends in .txt instead of .out, and then open it in your favorite browser.
> 
> -Dan
> 
> 
> On 3/19/07 3:51 PM, "Ellard, Daniel" <ellard at netapp.com> wrote:
> 
> > 
> > The following draft is submitted for review.  Our goal is to jump-start
> > discussion of federated file system protocols by articulating what we
> > believe are the functional requirements of such a system.  We welcome
> > input and discussion from everyone.
> > 
> > ...
> 
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: DistFSReqts.out
> Type: application/octet-stream
> Size: 29492 bytes
> Desc: not available
> Url : https://lists.sdsc.edu/pipermail/federated-fs/attachments/20070319/a8766506/DistFSReqts.out
> 
> 
    
    