[Federated-fs] Conf call 6/5/2008
Renu Tewari
tewarir at us.ibm.com
Wed Jun 11 14:45:56 PDT 2008
So we are raising a couple of important questions in this discussion.
I) Does the fed-fs protocol enable the management of a common namespace
across a set of distributed multi-vendor fileservers?
II) Does the NSDB maintain state related to the namespace especially
junction information?
If we are not supporting (I) then we need to step back and state what the
goal of fed-fs really is. There is not much value add in going through
with a new protocol that doesn't do much.
Related to (I) and the point that Paul raised we had this section on a
"root fileset" which is the top of the common namespace. (We had removed
the root fileset and finding root fileserver stuff from the draft for
the initial version just to make it simple for starters so it may need to
be revisited).
This root fileset information can be stored at one NSDB (in the degenerate
case) or at all NSDBs that wish to be part of the common namespace.
The FSLs that instantiate the root fileset are located at the root
fileservers. ONLY the leaves of the root fileset are junction points where
the target FSNs map to FSLs that are physically located at some fileserver.
The paths in the root fileset are logical and are used to organize the top
level of the tree. There are no junctions at the intermediate points in
the root fileset.
For example
/fedfs/com/div/A
/fedfs/com/div/B
/fedfs/com/div/C
Here only A,B,C are junction points.
The junction is the tuple < FSN_PARENT=ROOTFSN, FSN_TARGET=FSN_A,
PARENT_PATH=fedfs/com/div/A >
FSN_A can in turn contain a junction further down the tree. That junction
is not part of the root fileset.
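
As a rough illustration of the example above, here is a minimal Python sketch of the junction tuple and of the "junctions only at the leaves" rule. The class name, field names, FSN_B/FSN_C, and the helper function are hypothetical and are not taken from the draft or any schema.

```python
from dataclasses import dataclass

ROOT_FSN = "ROOTFSN"  # placeholder name for the root fileset's FSN


@dataclass(frozen=True)
class Junction:
    fsn_parent: str   # FSN of the fileset that contains the junction
    fsn_target: str   # FSN that the junction refers to
    parent_path: str  # path of the junction within the parent fileset


# The three junctions from the example; only A, B, C are junction points.
root_junctions = [
    Junction(ROOT_FSN, "FSN_A", "fedfs/com/div/A"),
    Junction(ROOT_FSN, "FSN_B", "fedfs/com/div/B"),
    Junction(ROOT_FSN, "FSN_C", "fedfs/com/div/C"),
]


def junctions_only_at_leaves(junctions):
    """Return True if no junction path is a proper prefix of another one."""
    paths = [tuple(j.parent_path.strip("/").split("/")) for j in junctions]
    for p in paths:
        for q in paths:
            if len(p) < len(q) and q[:len(p)] == p:
                return False
    return True
```

A junction at an intermediate point such as fedfs/com would make the check fail, which is the case the root fileset is meant to exclude.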
This leads to point (II) which was the discussion we had last week.
In the current state of the schema the junction information represented by
the tuple < FSN_PARENT, FSN_TARGET, PARENT_PATH > is not stored anywhere.
It is neither in the NSDB nor at the fileserver. The fileserver only has
some state marking a given directory in the parent fileset to point to the
target FSN or however else it implements junctions. The fileserver knows
nothing about filesets.
So my main concerns are:
--- Using the current schema the junctions are not really managed by the
fed-fs protocol as you cannot move the junction or delete it or query it or
know where it exists. It depends solely on how the fileserver implements
junctions and what it can tell an admin about it.
--- The relationship between filesets is not available to an admin through
the NSDB, so the admin has no idea how the namespace is organized.
--- When an FSL is created we need to rely on an external replication
protocol that will also replicate the junction information. If this
protocol is not a part of fed-fs we cannot rely on it to work cross
vendor. To depend on this undefined replication protocol for the basic
functioning of the federation bothers me.
As I understand from Dan's comments he had 3 issues on storing the
junction information in the NSDB. #1 was related to storing the
parent_path information of the junction in the NSDB. The main issue being
what happens on a rename of any of the path components.
One way is not to store the path information but only < FSN_PARENT,
FSN_TARGET> in the NSDB. This will let an admin know how the namespace is
laid out in terms of filesets but not paths. This still leaves open how to
delete a junction. Another way is to have the fileserver return an
opaque handle that identifies the junction location instead of a path. The
NSDB can store it as the tuple
< FSN_PARENT, FSN_TARGET, opaque_handle > and use it to manage the
junction. This would require the fileserver's participation. Other ideas
are possible.
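
The opaque-handle option could be sketched roughly as follows. This is only an in-memory illustration under my own assumptions: the class and method names are hypothetical, the handle is treated as uninterpreted bytes issued by the fileserver, and nothing here reflects an actual NSDB schema or protocol operation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JunctionRecord:
    fsn_parent: str       # fileset that contains the junction
    fsn_target: str       # fileset the junction points to
    opaque_handle: bytes  # issued by the fileserver; never parsed by the NSDB


class JunctionTable:
    """Hypothetical stand-in for junction records kept in an NSDB."""

    def __init__(self):
        self._records = set()

    def add(self, record):
        self._records.add(record)

    def children_of(self, fsn_parent):
        """List target FSNs hanging off a parent fileset (the layout query)."""
        return sorted(r.fsn_target for r in self._records
                      if r.fsn_parent == fsn_parent)

    def delete(self, fsn_parent, opaque_handle):
        """Remove the junction identified by the handle; return how many went."""
        victims = {r for r in self._records
                   if r.fsn_parent == fsn_parent
                   and r.opaque_handle == opaque_handle}
        self._records -= victims
        return len(victims)
```

Because the handle rather than a path identifies the junction, a rename of a path component would not invalidate the record. The fileserver would still have to act on the handle when the junction itself is removed.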
Dan's #2 was a non issue. The target FSN's fileserver knows nothing about
where it is hanging from in the namespace.
For #3 I am not sure why this relates to storing junction information in
the NSDB. It is related to how the export path is related to the common
namespace. Maybe I did not get that issue.
The problem of storing paths also exists in storing FSL paths in the NSDB.
Am I missing something there?
regards
renu
From: Paul Lemahieu <LeMahieu_Paul at emc.com>
Sent by: federated-fs-bounces at sdsc.edu
Date: 06/11/2008 10:40 AM
To: "Everhart, Craig" <Craig.Everhart at netapp.com>
cc: federated-fs at sdsc.edu, "Ellard, Daniel" <Daniel.Ellard at netapp.com>
Subject: Re: [Federated-fs] Conf call 6/5/2008
As for why, it's to have a global namespace that spans heterogeneous
file servers, and have a common configuration/management for that. A
Linux box or AIX or a NAS vendor could all be exposing the same
/fedfs/home, and handing out referrals to users.
--Paul
On 2008-Jun-11, at 10:23, Everhart, Craig wrote:
> OK: why? Also, what do we do with the hard questions (synchronizing
> multiple data sources, doing permissions correctly)?
>
>
> ________________________________
>
> From: LeMahieu, Paul [mailto:LeMahieu_Paul at emc.com]
> Sent: Wednesday, June 11, 2008 1:19 PM
> To: Everhart, Craig; Robert Thurlow
> Cc: Ellard, Daniel; federated-fs at sdsc.edu
> Subject: Re: [Federated-fs] Conf call 6/5/2008
>
>
> Yes, it's independent of anything with DNS. When I say "top-of-tree",
> I'm talking about storing all the configuration of the
> namespace (tens of thousands of junction points and their paths in
> the logical namespace). For example, if there is /fedfs/home/bob,
> there is an entry mapping /fedfs/home/bob to bob's share on some
> physical file server. We'd be storing a pseudo file system
> representing the top-of-tree in the NSDB.
>
> --Paul
>
>
> On 08/6/10 20:11, "Everhart, Craig" <Craig.Everhart at netapp.com> wrote:
>
>
>
> Is this "top of tree" thought independent of the DNS-based lookup? Why
> wouldn't that simply *be* the top level, leading to, as you say, real
> file systems? Is this an intermediate level?
>
> I can read all your text substituting "DNS" for "NSDB" and get a working
> (and specified and nearly existing) result. What am I missing? Do I
> need to invent yet another replicated global service?
>
> Craig
>
> > -----Original Message-----
> > From: Paul Lemahieu [mailto:LeMahieu_Paul at emc.com]
> > Sent: Tuesday, June 10, 2008 7:53 PM
> > To: Robert Thurlow
> > Cc: Everhart, Craig; federated-fs at sdsc.edu; Ellard, Daniel
> > Subject: Re: [Federated-fs] Conf call 6/5/2008
> >
> > Robert,
> >
> > This is very much the motivation. It's a couple of key things:
> >
> > * A standard to facilitate the administration of a
> > federated global namespace
> > * The ability to federate different file servers so they
> > all expose the same global namespace
> >
> > A few things about how I would see this:
> >
> > * I always thought of this as a "top-of-tree" namespace.
> > In other words, at some point the global namespace ends and
> > you hit real file systems.
> > Perhaps it is possible to manage junction points at arbitrary
> > locations in the global tree, I just hadn't really considered it.
> > * Changes to the global namespace are made by
> > administrators, and made via the NSDB. The global namespace
> > is not frequently changing.
> > Participating file servers reflect those changes later in a
> > loosely-coupled manner.
> > * This does not invalidate the existing federated-fs
> > work. It would be essentially an additional database in the
> > NSDB, mapping paths to FSNs.
> > * It is assumed that the NSDB takes care of replicating this data.
> >
> > The big difference from what you describe below and my view
> > is whether we have an NSDB that reflects the namespace
> > created on the file server (the query model you describe
> > below), or whether the NSDB is authoritative for the
> > namespace and the file servers reflect that namespace
> > configuration (my description, where the NSDB is authoritative).
> >
> > --Paul
> >
> > On 2008-Jun-09, at 15:36, Robert Thurlow wrote:
> > > Everhart, Craig wrote:
> > >> I *totally* agree with Dan's bias on this. It's a surprise to me
> > >> that others thought that there was a "namespace" that existed
> > >> independently of the file systems that make it up.
> > >>
> > >> What is the relationship between path data that exists both in a
> > >> fileset (in a file server--Dan's #1 case) and in the NSDB and (as in
> > >> Paul's addendum to Dan's #3) all the instances of the parent path
> > >> data that are subsidiary filesets? Is there some authoritative copy
> > >> with the others just hints? What is the replication protocol by
> > >> which path data is propagated? How consistent does it have to be?
> > >>
> > >> If we were stumbling thinking about the constraints on fileset
> > >> replication and consistency, why would this be a simple answer?
> > >> Right now it's an unmotivated, underspecified, and unconstrained
> > >> additional criterion for any implementation. (And them's its good
> > >> points...:-})
> > >>
> > >> If others (Paul? Renu?) feel like this needs to be changed, could
> > >> they do it with a more fleshed-out proposal?
> > >
> > > I don't want to keep a copy of all of the mapping data (with or
> > > without authority) in the NSDB, either. But I do wonder if the
> > > motivation is something like this:
> > >
> > > I want an admin application that can show me the whole namespace. I
> > > want to be able to browse it like a Google map, going up/down,
> > > left/right in the global directory tree. I'd like to see which nodes
> > > are junctions, and to be able to click on them and see the details
> > > about their replicas. In the fullness of time, I'd like to be able to
> > > click on a point in the namespace and select an option to make that
> > > point a separate, replicated filesystem, and to adjust the
> > > characteristics of existing filesystems.
> > >
> > > Now, if all we have is a way to drill down from the top of the
> > > namespace for any particular path, this could be bloody hard. Having
> > > more information would help. An alternate question is, where could we
> > > get more info?
> > >
> > > One suggestion is that the protocol permit a query to an NFS server
> > > participating in the namespace, to list the junctions in a particular
> > > filesystem. If we could start at the top, find the Nth level
> > > filesystems and ask them what junctions point outside of the
> > > filesystem, we could enumerate the namespace far quicker. We would
> > > still have to send packets all over the place - "find the root" DNS
> > > queries, NSDB lookups, NFS accesses and whatever we use to enumerate
> > > junctions in a filesystem - but I have always expected that.
> > >
> > > Does this shed light or kick up dust? :-)
> > >
> > > Rob T
> > >
> >
> >
>
>
>
>