Armando's Notes from WWW-5
France Telecom Talk
A big wheel from France Telecom gave a plenary address (in French; I was
even able to follow, without using the simultaneous translation headset!).
It was fluff, but at least it seems like France Telecom groks the Internet
thing, unlike some US carriers I could name. Intersting point: the Minitel
system has taught FT but good that if it ain't easy to use, it won't
get used. So they see the NC angle.
FT's position is that the telecommunicat ions pie will now be shared by
more than just a few big companies, but fortunately, the pie is getting
bigger (and it's FT's mission to encourage that growth, to preserve its
earnings). They also see that IP telephone software is forcing them into
a new business model and they don't plan to try to simply outlaw it as the
US carriers did. Their belief is that the Internet economic model will
rely on paying for QOS, not just for bits. (Will this discourage NC's and
encourage capable laptops? Or better, will it encourage NC's served by
It seems that if someone could propose a system solution for charging for
and reliably delivering QoS, they could make a lot of money. (My opinion,
Nokia Industry Talk
A clueless droid talked about the Nokia 9000 (Euro price about $2000). Amazingly,
he did not sell it as a Web device or even email device, and equally amazingly,
most questions were about the phone functionality, weight, etc.
- Didn't mention Geoworks, not even once, not even indirectly or by logo.
You'd never know they had anything to do with this. Jonathan will be glad
- Robustness is a problem: comm dropouts are frequent and the network
stack often doesn't survive them. "There is a lot yet to be done"
is how the speaker put it.
- Eric Brewer went to Gosling's Java talk and said Gosling mentioned the
challenge of putting Web pages on a small screen such as the Nokia's. I
- Did you know GSM handsets can get geographic location info from their
basestation? Provided you can get the service provider to tell you the secret
protocol for obtaining it; most consider this sensitive info.
- Packet data on GSM coming soon? Nokia's proposal (GPRS) under review
now. Will be like CDPD at 9.6/14.4Kbits.
Multimedia and the Web
Rob Glaser (pres/CEO of Real Audio) gave a talk that was better than I expected.
He understands, unlike John Patrick from IBM who spoke at the opening session,
that the bandwidth for quality A/V will not be there unless we start
doing intelligent bandwidth management; simply throwing more fiber at the
problem won't solve it.
Perhaps surprisingly, this panel session was mostly content-free. Panelists
represented British Telecom, the European Commission, a British security
consulting firm, Novell. The high order bits were:
Someone brought up the standard argument "when crypto is outlawed,
only outlaws will have crypto". Panelists were in agreement on the
response: the purpose of deciding policy and setting up crypto infrastructure
is to protect law-abiding parties who want to conduct business with integrity.
There will always be sophisticated outlaws anyway. Of course this completely
circumvents the question of private use of strong crypto, but the
panelists seemed less concerned to address that debate.
- Global crypto is hard as a policy issue, but must be resolved in order
for international commerce to proceed. Duh.
- Need to distinguish ethical arguments from actual business requirements.
Current debates are "missing the point" in this respect. E.g.
the GII (lobbying group in Europe for e-commerce) has agreed in principle
to the French government's key escrow constraints.
- Most models seem to be based around trusted third parties (TTP) which
are not necessarily outside CA's or escrow agents.
- This is still a policy question (it will be decided by groups like OECD,
not by the ITU, which is a standards organization).
An interesting tidbit: even though US export laws prohibit not only crypto
engines of certain kinds but even modular interfaces to those engines
("crypto with a hole"), Novell's strategy for a worldwide crypto-enabled
NetWare will be "controlled modular crypto": they will indeed
define a narrow modular (even layered) crypto interface, but their runtime
linker will refuse to load all but "properly signed" crypto modules,
which can be made available by Novell or partners as local legislation and
export laws permit. (The guy said "it's crypto with a hole, but a hole
that onle we (Novell) can plug.") I was stupefied and asked
the obvious question. The response was that "we are protected [from
people subverting the kernel] to the extent that our licensing agreement
already protects us from people reverse-engineering our product." When
I pressed him, he said that reverse-engineering NetWare, even to find the
crypto hooks, was "hard enough" and therefore he wasn't concerned
that they needed to de-modularize the interface. I guess Ian and Dave have
their work cut out for them.
Most of the other panelists thought NetWare's approach was misguided: it
was a technical answer (and a bad one, IMHO) to a policy question that will
eventually need to get resolved anyway. Frankly I thought Novell's talk
was the most offensively blatant marketing spiel I saw at the whole conference,
and that includes the industrial demos pavilion.
I didn't know this was such a spirited debate but evidently it is. It was
about URC's -- universal resource characteristics, metadata about URN's/URI's
which may be stored physically separately from the resources themselves.
Consistent URC information would allow high quality indexing/search, resource
discovery, cross-protocol location services (e.g. server maps URN to URC
+ set of URL's) and other wonderful things. The issue is what exactly URC's
should be. Some points/perspectives:
Metadata for the Masses
- Leslie Daigle, from a company I forget, says that searches should always
return URC's, but in the degenerate case a URC is a URN/URL. She also pointed
out that URC's should be first-class objects that can be used for arbitrary
description of "my-favorite-new-resource": one should clearly
separate the format of URC's from the particular fields etc. necessary to
do URN resolution, which is just one use of them.
- URC's are sets of name-value pairs. Remember we need a way to describe
services (eg search engines) as well as "static" resources.
- We should stop wranginglin about what a "name" is: setttle
on an empty transport standard that is general, machine parsable and encryptable,
and the content standard will emerge later. Some progres towards this was
evidently made in the Warwick Metadata Workshop, which I hadn't even heard
- Proposal: Metadata packages/standards would be developed and maintained
by professional organizations with standing in their field: libraries would
develop a bibliography format, museum curators would develop a collection-cataloging
format, etc. Plus individuals would be free to define their own formats
as well. The formats would then be registered in a type registry, as MIME
types are now.
I thought Eric would attend this, but evidently he had other stuff to do.
Panel: Internet Indexing
As a personal aside, my first-order impression of Tim Bray, both in the
content of the positions he advanced and the way in which he advanced them,
was "We are OpenText, fuck you very much for your patronage."
If there is a major crawler shakeout, I will enjoy watching him die (though
that's not necessarily realistic). Interestingly, he got an award for "Meritorious
research and interesting presentation" (he presented a paper similar
to Allison Woodruff's et al. about measuring the profiles of Web documents,
presumably using the OpenText database.)
- Tim Bray, OpenText: servers are getting free benefit from being crawled
(exposure), yet crawlers do all the work! You servers should do your share
of work! (metadata, update protocols, duplicate detection, canonical names,
I don't see how a nonprofit org. whose content has no direct commercial
value is going to be convinced of this, but Tim seemed arrogant, er, confident
that even non-commercial servers would see the value in being crawled and
make the extra effort. I agree that metadata and canonical names are a good
thing anyway, but I didn't like the way he framed the argument.
- Tim also said OpenText was considering taking money to allow a particular
site to be at the top of a results list for particular queries, "as
long as they were clearly marked as such". I thought this was appalling
and would undermine any credibility in search engines. I pointed out that
there were already advertisement-filtering proxies, and it couldn't be very
productive for anyone to just go down the path of users and advertisers
fighting each other to see who can be the more clever. The reply was that
in the long run this would produce more robust software, just as the "fight"
between the cryptographers and the crypto-breakers has produced stronger
algorithms. I think there is absolutely no parallel here, but I thank Tim
Bray for helping me choose my future search engine.
- There was a call for a "crawling consortium", to
- develop crawling standards and eliminate redundant crawls
- establish metadata standards and solve the "text-inside-gifs
- establish authenticated-crawler standards to address copyright protection
- Tim Bray thinks a crawling consortium won't work, since current services
regard their crawled data as their primary advantage, not a commodity. Of
course nobody said the crawlers had to give data away...Tim's justification
was also that "brute force" still works just fine for crawling
the whole web and doesn't waste a significant amount of bandwidth.
I was surprised there wasn't more discussion of social issues related to
Internet access, etc. Everyone agreed that the Internet had "changed
everything" -- it's a big publishing democracy, the power is being
taken from big companies and put into consumers' and individuals' hands,
voice oriented telco's and their arcane tariff structure are dinosaurs,
etc. However, there are a number of respects in which the current Internet
bears no resemblance to reality:
Interestingly, Steve McGeady (Intel Arch. Labs, the think tank/research
lab in Portland) made a "play the skeptic" presentation during
the opening session of "business on the web" that did address
the above issues. I was pleased that of all the business speakers, the representative
of my former employer seemed to have the best grip on reality.
- Billing: nobody seems to be paying except you and a couple of guys where
you work. Obviously the current billing structure(based on geographic distance
and time) won't work. What about multicast billing? (Multicasting is being
discussed in several contexts, including cooperative caching.)
- Censorship: in some countries, notably Singapore, the ISP's are basically
state run because either they are synonymous with the state-owned telco
or because only the state can make such an infrastructure investment. This
leads to a big opportunity for state censorship, as has already occurred:
certain sites and newsgroups are banned in Singapore (ISP's are held liable
for access), and in France two ISP owners were recently jailed for a similar
- Centralization by big servers/companies: small sites either go unnoticed
or get killed by being swamped with traffic when popular. Result: the big
sites get bigger, centralizing information dissmeniation in the hands of
a few large companies. Isn't this deja vu all over again?
- Bandwidth: IMHO the IBM guy, John Patrick, did a tremendous amount of
damage by talking about all the great things that are going to be delivered
to you in real time via the WWW and then blithely saying "the Net can
deal with it" (his exact words describing the bandwidth situation).
We have a bandwidth problemnow. Adding more bandwidth isn't going to solve
it unless we also start managing it;just ask any city that has spent millions
to add another lane to an existing busy highway. The technical community
seems to appreciate this. The business droids do not necessarily, and that's
disconcerting. (I noticed that during the separate "business on the
web" track, there seemed to be no mention of this issue.)
These came to me at various times in various contexts
- Can we leverage RTP for the proxy, esp. for audio?
- If Daedalus will support QoS admission, etc., will it support RSVP?
- Yet another progressive-encoding streaming format: Microsoft ActiveMovie.
Part of the ActiveX API, which is interesting--complete fusion of the "desktop
metaphor" and browser. ActiveX is the "cross product" of
OLE and WWW; the next version of Internet Explorer has it completely integrated,
and you can have folder/file hierarchies on your desktop open to WWW sites
and vice versa. Nothing revolutionary except a standard set of API's (object
framework of course) to glue it all together.
- We should write up a 1-page list of "recommended hardware support
for PDA's" (e.g. audio, encryption), with refs. to the projects we
did that resulted in the recommendations.
- Various WWW servers have peak hours that are easily predicted. This
should be exploitable for proxying/caching.
- Long-term, what role will executable content play? Will it be more
like Java applets, or more like HTML generation on the fly to construct
custom pages for users? (There may be a market for the proxy among the
latter: content generation on the fly, customized not only to the user but
to the machine type and network connection.)
- How to distinguish "types" of images (redball.gif vs. a "real"
picture of something)? This would win BIG in the proxy.
- Internationalizing the web: is a language-translation proxy completely
unrealistic? What's the state of the art in translation of written text?