From ellard at netapp.com  Tue May  1 18:55:11 2007
From: ellard at netapp.com (Daniel Ellard)
Date: Tue, 01 May 2007 21:55:11 -0400
Subject: [Federated-fs] Discussion of FSN resolution: what is required vs
	what is expected
Message-ID: <C25D67BF.7284%ellard@tahoe.netapp.com>


A few points related to FSN resolution came up in the conference call, so I
think it's a good idea to discuss the current draft requirements and
different methods that can satisfy those requirements.

The first requirement is that an FSN must convey two pieces of information:

1. The NSDB that "owns" the information about the FSN.

2. The "junction key" that identifies the fileset and is used by the "owner"
NSDB as the key for information about that fileset.

This leads to the simple resolution mechanism outlined in the current draft:
to find the FSLs for an FSN, use the FSN to find the NSDB, and then ask
the NSDB for the FSLs for the junction key represented by the FSN.

Given the current requirements and definition, this resolution method MUST
work.  However, there is no requirement that this is the resolution method
that must be used!  This method is the fallback when all else fails
(although for a simple implementation, it could be the only method).
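
As a rough illustration of this baseline method, here is a minimal sketch in
Python.  The class names, field names, and example host names are
hypothetical and are not taken from the draft; the only thing the sketch
assumes is that an FSN carries the owner NSDB and the junction key.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FSN:
    """A fileset name (hypothetical layout): owner NSDB plus junction key."""
    nsdb_name: str      # identifies the NSDB that "owns" this FSN
    junction_key: str   # key the owner NSDB uses for the fileset


@dataclass
class NSDB:
    """Toy owner NSDB: maps junction keys to lists of FSLs."""
    name: str
    fsls_by_key: dict = field(default_factory=dict)

    def lookup(self, junction_key):
        # Return a copy so callers cannot mutate the NSDB's own record.
        return list(self.fsls_by_key.get(junction_key, []))


def resolve(fsn, nsdbs):
    """Baseline resolution: FSN -> owner NSDB -> FSLs for the junction key."""
    owner = nsdbs[fsn.nsdb_name]            # step 1: use the FSN to find the NSDB
    return owner.lookup(fsn.junction_key)   # step 2: ask it for the FSLs


# Example data; the names below are made up for illustration.
nsdbs = {
    "nsdb.example.org": NSDB(
        "nsdb.example.org",
        {"key-1": ["srv1:/export/a", "srv2:/export/a"]},
    )
}
fsls = resolve(FSN("nsdb.example.org", "key-1"), nsdbs)
```

Any other resolution method (a cache, a proxy, and so on) would only need to
return the same answer that `resolve` would have returned.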

For example, in a clustered system, all servers in the cluster might funnel
their requests through a local proxy NSDB that caches the responses from the
owner NSDBs.  If two servers request the same resolution within a short
period of time, then the first might have to go through to the owner NSDB,
but the second could be answered from the cache.  Cache entries would be
marked stale and invalidated after a given period of time elapsed.  A
callback could be registered at the owning NSDB for FSNs of local interest
(i.e., the FSNs present in junctions within the section of the namespace
implemented by the server), if that turns out to work better than polling.
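
The caching proxy described above could look something like the following
sketch.  Everything here is an assumption made for illustration: the class
name, the `ask_owner` callable that stands in for a query to the owner NSDB,
the TTL value, and the `invalidate` hook that a callback from the owner might
use.

```python
import time


class CachingProxyNSDB:
    """Toy proxy that caches owner-NSDB answers for a fixed time (hypothetical)."""

    def __init__(self, ask_owner, ttl_seconds=60.0, clock=time.monotonic):
        self._ask_owner = ask_owner   # callable: fsn -> list of FSLs
        self._ttl = ttl_seconds
        self._clock = clock           # injectable so expiry can be tested
        self._cache = {}              # fsn -> (expiry_time, fsls)
        self.owner_queries = 0        # how often we went through to the owner

    def resolve(self, fsn):
        now = self._clock()
        entry = self._cache.get(fsn)
        if entry is not None and entry[0] > now:
            return entry[1]           # fresh entry: answer from the cache
        # Missing or stale entry: go through to the owner NSDB and re-cache.
        self.owner_queries += 1
        fsls = self._ask_owner(fsn)
        self._cache[fsn] = (now + self._ttl, fsls)
        return fsls

    def invalidate(self, fsn):
        """Drop an entry early, e.g. when the owner calls back about a change."""
        self._cache.pop(fsn, None)


# Two requests inside the TTL cost one owner query; a later one costs another.
fake_now = [0.0]
proxy = CachingProxyNSDB(
    lambda fsn: ["srv1:/export/a"], ttl_seconds=30.0, clock=lambda: fake_now[0]
)
proxy.resolve("fsn-1")
proxy.resolve("fsn-1")
queries_within_ttl = proxy.owner_queries
fake_now[0] = 31.0
proxy.resolve("fsn-1")
queries_after_ttl = proxy.owner_queries
```

Whether expiry by timer or invalidation by callback works better is exactly
the open question; the sketch supports both so that either can be tried.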

There are many solutions to this kind of problem, but I won't claim to know
enough about the parameters of the system -- how many nodes, how fast the
FSN bindings change, etc. -- to pick the winner.  I don't think we'll know
for sure until we actually build one, but we need to make something flexible
enough that the real behavior of the system can emerge.  We don't want
something that admins only use in limited ways because those are the only
ways that the system works!

One solution that doesn't satisfy the current criteria is to use a
hierarchical system, similar to DNS.  The resolution method starts with a
local NSDB (or similar service) and escalates up through the hierarchy until
the value is found, and then the results are cached along this path.  The
problem with this approach is not technical -- we know how to do this.  The
problem is that I don't see a way to map a hierarchy onto a federation of
peers.  The federation member(s) that host the root(s) of the hierarchy
have an asymmetric relationship with the other members.  Instead of a
junction being a relationship between two federation members, now additional
federation members may be required in order to traverse a junction.

You may note the arguable contradiction (or hypocrisy) that I am opposed to
using intra-federation DNS-like resolution for FSNs, but I am perfectly
happy assuming that the system uses DNS to perform many federation-oriented
tasks, such as finding NSDBs or naming the servers on which filesets are
implemented.  This is because the DNS framework is a superset of any
federation -- the root servers (and probably most of the ancestors on the
way to the root) are independent of and/or outside of the federation.

Please add your thoughts...

What do we need to require in order to make a flexible resolution protocol
possible?

Thanks,
    -Dan

From Black_David at emc.com  Wed May  2 09:37:54 2007
From: Black_David at emc.com (Black_David at emc.com)
Date: Wed, 2 May 2007 12:37:54 -0400
Subject: [Federated-fs] Discussion of FSN resolution: what is required
	vs what is expected
In-Reply-To: <C25D67BF.7284%ellard@tahoe.netapp.com>
References: <C25D67BF.7284%ellard@tahoe.netapp.com>
Message-ID: <F222151D3323874393F83102D614E055068B9378@CORPUSMX20A.corp.emc.com>

Dan,

As one of the people who had things to say about this, I think my primary
concern is that the functionality provided by the NSDB should be architected
and viewed as a "service" at this stage of the work, so as not to preclude
distributed name resolution infrastructures, even though the easiest and
most obvious implementation is to just go ask the responsible NSDB for the
resolution.  We definitely need to allow for multiple name resolution
"services" for the same reason that multiple NSDBs are allowed.

Beyond this, I don't have strong disagreements with most of what Dan wrote.

Thanks,
--David


From Daniel.Ellard at netapp.com  Wed May  2 16:44:45 2007
From: Daniel.Ellard at netapp.com (Daniel Ellard)
Date: Wed, 02 May 2007 16:44:45 -0700
Subject: [Federated-fs] Discussion of FSN resolution: what is required
 vs what is expected
In-Reply-To: <F222151D3323874393F83102D614E055068B9378@CORPUSMX20A.corp.emc.com>
Message-ID: <C25E707D.5A69%Daniel.Ellard@netapp.com>


Yes, sorry I forgot to note that.

The NSDB is a service.  The name is perhaps somewhat misleading (but it's
better than half a dozen names we came up with first...).

I think it's likely that we will need another term here, to distinguish the
name resolution service from the administration service for the individual
NSDBs.  When performing resolution, maybe it doesn't matter which NSDB
instance you ask -- it will either know the answer, or route your request
along to someone who does.  When performing admin tasks, I think it's more
likely that it does matter -- in this case it makes sense to talk directly
to the NSDB that owns the FSN.

The reason I think that there is a distinction here is because of the
different security models for the different kinds of queries.  It might be
OK for everyone to do resolution (and everyone else to know about it), but
admin operations might be more private and protected in a different manner.
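
To illustrate the distinction, here is a small sketch in which any node will
answer or forward a resolution request, while admin requests are accepted
only by the owning node and only from an authorized principal.  The class,
the method names, and the simple set-of-admins check are all hypothetical; a
real protocol would rely on an actual security mechanism.

```python
class NSDBNode:
    """Toy NSDB node with separate resolution and admin paths (hypothetical)."""

    def __init__(self, name, admins):
        self.name = name
        self.admins = set(admins)   # principals allowed to administer this node
        self.bindings = {}          # junction_key -> list of FSLs owned here
        self.peers = {}             # node name -> NSDBNode, used for forwarding

    def resolve(self, owner_name, junction_key):
        """Open to any caller: answer locally or route to the owner."""
        if owner_name == self.name:
            return list(self.bindings.get(junction_key, []))
        return self.peers[owner_name].resolve(owner_name, junction_key)

    def set_fsls(self, principal, owner_name, junction_key, fsls):
        """Admin operation: must be sent directly to the owning node."""
        if owner_name != self.name:
            raise ValueError("admin requests must go directly to the owning NSDB")
        if principal not in self.admins:
            raise PermissionError(f"{principal} may not administer {self.name}")
        self.bindings[junction_key] = list(fsls)


a = NSDBNode("nsdb-a", admins={"alice"})
b = NSDBNode("nsdb-b", admins={"bob"})
a.peers["nsdb-b"] = b
b.peers["nsdb-a"] = a

a.set_fsls("alice", "nsdb-a", "key-1", ["srv1:/export/a"])
via_owner = a.resolve("nsdb-a", "key-1")   # asked the owner directly
via_peer = b.resolve("nsdb-a", "key-1")    # asked a peer, which forwards
```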

What do you think?

Thanks,
    -Dan



From Black_David at emc.com  Wed May  2 22:52:53 2007
From: Black_David at emc.com (Black_David at emc.com)
Date: Thu, 3 May 2007 01:52:53 -0400
Subject: [Federated-fs] Discussion of FSN resolution: what is required
	vs what is expected
In-Reply-To: <C25E707D.5A69%Daniel.Ellard@netapp.com>
References: <F222151D3323874393F83102D614E055068B9378@CORPUSMX20A.corp.emc.com>
	<C25E707D.5A69%Daniel.Ellard@netapp.com>
Message-ID: <F222151D3323874393F83102D614E055068B9386@CORPUSMX20A.corp.emc.com>

> Yes, sorry I forgot to note that.
> 
> The NSDB is a service.  The name is perhaps somewhat misleading (but it's
> better than half a dozen names we came up with first...).
> 
> I think it's likely that we will need another term here, to distinguish the
> name resolution service from the administration service for the individual
> NSDBs.  When performing resolution, maybe it doesn't matter which NSDB
> instance you ask -- it will either know the answer, or route your request
> along to someone who does.  When performing admin tasks, I think it's more
> likely that it does matter -- in this case it makes sense to talk directly
> to the NSDB that owns the FSN.

Let's try FNRS - F(S)N Resolution Service.  The discussion of admin tasks
makes sense - one wants to talk directly to the responsible NSDB that can
then export the appropriate update into an FNRS that may span multiple
NSDBs.

> The reason I think that there is a distinction here is because of the
> different security models for the different kinds of queries.  It might be
> OK for everyone to do resolution (and everyone else to know about it), but
> admin operations might be more private and protected in a different manner.
> 
> What do you think?

Makes sense.  We need to make sure not to inadvertently exclude a
single-NSDB-scoped FNRS, as that's a fairly obvious simple
implementation.

Thanks,
--David
----------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
black_david at emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------

From ellard at netapp.com  Thu May 10 18:56:38 2007
From: ellard at netapp.com (Daniel Ellard)
Date: Thu, 10 May 2007 21:56:38 -0400
Subject: [Federated-fs] New internet draft regarding federated file systems
Message-ID: <C2694598.74A9%ellard@tahoe.netapp.com>


I've just submitted the attached file as an internet draft.  It is a
proposed requirements document for federated file systems, and its purpose
is to generate discussion and build consensus around the requirements for
federated file systems (which we'll then use as a basis for actually
building such systems...).

This will all sound familiar to people on the federated-fs list, but maybe
not to people only on the nfsv4 mailing list.

This draft, which has been reformatted in the style of an RFC, is the fourth
draft of this document, but the numbering has been reset in accordance with
the IETF draft protocol.  The major differences between this draft and
earlier drafts are:

- Reformatting

- Dividing the "NSDB" concept into two related concepts: the "NSDB service"
(an abstract service, implemented by a set of nodes) and the "NSDB node"
(one of the nodes that implements a specific part of the NSDB service).

There are also many rewordings, error fixes, and similar minor changes
suggested by reviewers via email or on the last conference call.

-Dan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: draft-ellard-federated-fs-00.txt
Type: application/octet-stream
Size: 48618 bytes
Desc: not available
Url : https://lists.sdsc.edu/pipermail/federated-fs/attachments/20070510/e5356aa9/draft-ellard-federated-fs-00.txt 

From ellard at netapp.com  Thu May 24 13:09:52 2007
From: ellard at netapp.com (Ellard, Daniel)
Date: Thu, 24 May 2007 16:09:52 -0400
Subject: [Federated-fs] Conference calls to discuss the federated-fs
	requirements draft (and eventually the protocol)
Message-ID: <C98692FD98048C41885E0B0FACD9DFB8044BA981@exnane01.hq.netapp.com>


The calls will be every second Monday, starting June 11, from 4-5pm
Eastern time.

Next week, due to the holiday, the call will take place on Tuesday 4-5pm
Eastern time.

The dial-in number is 888-765-3653 conf id 6206056.

-Dan

From ellard at netapp.com  Tue May 29 17:28:26 2007
From: ellard at netapp.com (Daniel Ellard)
Date: Tue, 29 May 2007 20:28:26 -0400
Subject: [Federated-fs] Notes from the conference call, 5/29/2007
Message-ID: <C2823D6A.79F1%ellard@tahoe.netapp.com>


My notes from the call this afternoon, 5/29/2007.

Attendees:  Daniel Ellard (NetApp), Craig Everhart (NetApp), David Ford
(NetApp), Paul LeMahieu (EMC), David Black (EMC), Renu Tewari (IBM
Research), Manoj Naik (IBM Research).  [If I missed anyone, please let me
know!]

Notes:

- in the RFC2119 boilerplate, we should convey the idea that in many places
in the document where these terms are used they imply a requirement of any
protocol that satisfies the requirements.  This is a requirements document,
not a specification, and so there's a level of indirection here that we
haven't quite captured.

If anyone knows how to convince xml2rfc to emit different prose for this
boilerplate, or has ideas how to make this clearer (or an example RFC that
has the different language) please let me know.

- The sections on "Security Considerations" and "IANA Requirements" are at
the beginning of this draft, but in the usual draft form, they come at the
end.  Was this intentional, and if so, what does it signify?

It wasn't intentional, and it signifies that I have more to learn about
xml2rfc.  I don't know why the sections were ordered in this way, but I will
figure out how to put them in the canonical order.  All future drafts will
have these sections in the usual order.

- The security considerations need to be rewritten to address attacks
against the protocol, not attacks against the system.  RFC3552 should be
used as a guide to the correct form and language to use.

- Attacks against the protocol (particularly attempts to hijack a session
and redirect it somewhere else) can be considered irrelevant if the protocol
is based on secure connections such as provided by TLS or RPC-GSS.  So
mandating that the protocol employ mechanisms such as these avoids a lot of
headaches with security.

At this point discussion dipped down into some details of different secure
connection mechanisms and their merits.  Useful discussion, but perhaps
premature, given that we're still working on the requirements, so I cut this
short.  Once we have settled on the requirements, we can discuss how to
implement them.  We might have to iterate, if we find ourselves with a set
of requirements that can't be satisfied by any known mechanisms and have to
compromise on the requirements in order to get something working.  (I don't
think we want to invent new mechanisms, unless there is no other solution.)

- In section 5, the concept of filesets should be explained in more detail.
Otherwise the reader might jump to the wrong conclusions and be confused
later.  Most readers won't jump ahead to the glossary.

- There is concern that section 5 is not clear enough to be understood by
people who don't already understand the issue.  Unfortunately, everyone on
the call understands the issues, so it's hard to figure out exactly how this
should be explained to the uninitiated.  Suggestions are more than welcome.

- One suggestion is to use examples, perhaps such as the automounter and
DFS, of things that attempt to do the sorts of things we want, but fall
short.  One of the distinctions is that the namespace and the filesystem are
woven together -- instead of the top-level of the namespace being defined
outside of the filesystem (i.e., automounter maps, that live in their own
space, from which hang file systems), the junctions are embedded in the file
system.

Suggestions for wording welcome.  I'm not sure I captured all the ideas.

- In section 6, we should lay out the success criteria -- how do we know
when the protocol is finished?  [probably just allude to the requirements
and assumptions that follow?]

- Section 7.2 could use some ASCII art to show the junction resolution.

- Perhaps 7.3 could use some ASCII art as well (not as necessary).

- Could use some ASCII art in the motivation to illustrate a federation.

IF ANYONE KNOWS OF A GOOD TOOL FOR CREATING ASCII DIAGRAMS, PLEASE LET ME
KNOW.  I have a feeling that a good diagram of a federation with several
servers, NSDB nodes, and clients may expand the frontier of 72-column ASCII
art.

- In R2, the concept of "promotion" must be made more clear.  [perhaps this
term needs to be in the glossary]

- R9: the question came up of whether a referral can redirect to a different
protocol as well as a different server (i.e., "the fileset you're looking
for is over there, but it only speaks NFSv3, so even though you're a CIFS
client, please deal with it somehow.").  Consensus was no.  The client
should assume that the same protocol is used throughout.  The NSDBs don't keep
track of what filesets are accessible via what protocols.  It may be that
some clients can't see some filesets because they're only accessible via a
subset of the protocols.  It is the responsibility of the admins to enforce
an accessibility policy that satisfies their clients.

- Should we have "views", where different clients see different parts of --
or different implementations of objects within -- the namespace, perhaps
keyed on client ID or user ID?  Consensus was no.  We don't have a mechanism
for this in the underlying protocols, and trying to layer this on top of
protocols like NFS (assuming that this is even possible -- it's not obvious)
is far beyond the scope of what we want to do right now.

Comments, remarks, etc?

The next conference call will be Monday, June 11, at 4pm Eastern time.  I'll
have a revised draft by the previous Wednesday (if not earlier).

888-765-3653 conf id 6206056

-Dan