[sldev] Looking at I18N formatting standards
Alissa Sabre
alissa_sabre at yahoo.co.jp
Thu Feb 19 06:56:51 PST 2009
> The I18N dev team is going to be tackling date, time, number, and
> currency localization issues in the next couple of quarters.
Wow, LL has an I18N dev team at last! I fully support this activity.
> We are
> looking at existing standards for replacing text inside a message and
> want to cover as many as possible before making a decision. Some
> possibilities that we are looking at include ICU
Do you want to create a large catalog? I'm not sure it's helpful...
Your example (ICU) combines locale-dependent time/date formatting and
parameter substitution in a message template. Java has a similar set of
APIs. (Note that the ICU and Java I18N features were designed by the same
group.) Microsoft .NET exposes yet another set of similar features
through the System.String.Format methods. (In .NET, all namespace
names, class names, and method names begin with a capital letter, by
the way.)
I don't know of any other _standards_ with similar features.
However, as far as I understand, the usual way to do this kind of job
today is to first format the data into an appropriate locale-dependent
string, then embed that string in a message by substituting it for some
marker in the template text. I.e., two separate sets of APIs
for two separate things. For example, the standard C function for
the first part (date/time) is strftime().
In most stand-alone application cases, the part of a program that
fills in a message *knows* the data type of each parameter to be embedded
into the message. The idea is that, for example, in your example code, a
programmer who writes sdargs["date"] or Calendar.getNow() should
already know that the particular data is a date, so there should be no
difficulty in calling strftime to request the "normal date format" before
passing it to a substitution function.
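To make the two-step idea concrete, here is a minimal sketch in C. The
function name build_message and the message wording are my own
inventions for illustration, not anything from the viewer code. Step 1
uses strftime() with %X and %x (the locale's preferred time and date
representations); step 2 is a plain snprintf() substitution.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: builds "At <time> on <date>, there was <event>".
 * Returns 0 on success, -1 if any buffer is too small. */
int build_message(char *out, size_t n, const struct tm *when,
                  const char *event)
{
    char date[64];
    char clock[64];
    int len;

    /* Step 1: format each value into a locale-dependent string.
     * The caller knows "when" is a date/time, so it picks strftime. */
    if (strftime(date, sizeof date, "%x", when) == 0)
        return -1;
    if (strftime(clock, sizeof clock, "%X", when) == 0)
        return -1;

    /* Step 2: substitute the already-formatted strings into the
     * message template. */
    len = snprintf(out, n, "At %s on %s, there was %s", clock, date, event);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}
```

A program would normally call setlocale(LC_TIME, "") once at startup so
that %x and %X follow the user's locale; without that call the "C"
locale is used.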
There are many standard sets of APIs for this purpose, i.e., one set
for formatting a single piece of data and another set for parameter
substitution in a message template. (As Steve partly wrote.) As a result,
there are not many standard ways to express parameter substitution and
time/date format control in the same string.
I personally prefer this traditional approach to the
Java/ICU/.NET _modern_ approach. My primary concern is the chance
of detecting errors at compile time. In the ICU model, all parameters are
passed as the same data type, and type matching is done at run time.
I don't like it.
# I'm afraid my preference is biased by some experiences from the old days...
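To illustrate the compile-time versus run-time point, here is a rough
sketch in C. Note that struct param and both function names are
hypothetical; this is NOT ICU's actual API, only an imitation of the
"every parameter has the same type" style next to a typed function.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical "one type for everything" parameter. */
enum param_kind { PARAM_DATE, PARAM_TEXT };

struct param {
    enum param_kind kind;
    const struct tm *date;   /* valid when kind == PARAM_DATE */
    const char *text;        /* valid when kind == PARAM_TEXT */
};

/* Generic style: a wrong kind of parameter compiles fine and is only
 * noticed here, at run time. Returns 0 on success, -1 on error. */
int format_date_param(char *out, size_t n, const struct param *p)
{
    if (p->kind != PARAM_DATE)
        return -1;
    return strftime(out, n, "%Y-%m-%d", p->date) > 0 ? 0 : -1;
}

/* Typed style: the signature accepts only a struct tm pointer, so
 * passing a string or a number is diagnosed by the compiler. */
int format_date_typed(char *out, size_t n, const struct tm *date)
{
    return strftime(out, n, "%Y-%m-%d", date) > 0 ? 0 : -1;
}
```

With the first function, a mistake shows up only when that code path
actually runs; with the second, the compiler complains before the
program is ever shipped.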
> I just don't
> want to invent a string substitution standard if there is something
> useful and widely used out there that won't make the translators' job
> more difficult.
I believe the most widely used parameter substitution syntax ever is
that of the so-called XPG printf format. It is the standard printf-style
format plus argument reordering. For example,
const char *message = "At %1$s on %2$s, there was %3$s on planet %4$d\n";
printf(message, time, date, event, planet);
It is a real _classic_ that has existed since the late 1980s, and it is
as hard to read as a standard printf format string, since it is intended
to be upward compatible with the standard printf.
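The point of the numbered form is that a translator can reorder the
parameters without touching the calling code. A small sketch, assuming
a C library that supports the POSIX/XPG "%n$" conversions (glibc and
the BSD/macOS libc do; Microsoft's plain printf does not). The function
name format_event and both template strings are made up for this
example.

```c
#include <stdio.h>

/* Two hypothetical templates for the same message. The second one
 * mentions the parameters in the opposite order. */
const char *const TEMPLATE_ORIGINAL =
    "At %1$s on %2$s, there was %3$s on planet %4$d";
const char *const TEMPLATE_REORDERED =
    "On planet %4$d, %3$s happened on %2$s at %1$s";

/* The arguments are always passed in the same order; only the
 * template decides where each one appears.
 * Returns 0 on success, -1 on error or truncation. */
int format_event(char *out, size_t n, const char *tmpl,
                 const char *time_s, const char *date_s,
                 const char *event, int planet)
{
    int len = snprintf(out, n, tmpl, time_s, date_s, event, planet);
    return (len < 0 || (size_t)len >= n) ? -1 : 0;
}
```

Note that a template must use numbered conversions for all of its
parameters or for none of them; mixing "%s" and "%1$s" in one string is
undefined.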
If you include similar features in scripting languages, I believe the
most widely used one is the parameter/variable substitution of the POSIX
Bourne shell. E.g.,
MESSAGE='At ${TIME} on ${DATE}, there was ${EVENT} on planet ${PLANET}'
eval echo "$MESSAGE"