Directory Server, Version 6.1
Appendix C. LDAP data interchange format (LDIF)
This documentation describes the LDAP Data Interchange Format (LDIF), as used by the idsldapmodify, idsldapsearch, and idsldapadd utilities. The LDIF specified here is also supported by the server utilities provided with the IBM® Directory.
LDIF is used to represent LDAP entries in text form. The basic form of an LDIF entry is:
dn: <distinguished name>
<attrtype> : <attrvalue>
<attrtype> : <attrvalue>
...

A line can be continued by starting the next line with a single space or tab character, for example:

dn: cn=John E Doe, o=University of High
 er Learning, c=US

Multiple attribute values are specified on separate lines, for example:

cn: John E Doe
cn: John Doe

If an <attrvalue> contains a non-US-ASCII character, or begins with a space or a colon ':', the <attrtype> is followed by a double colon and the value is encoded in base-64 notation. For example, the value " begins with a space" would be encoded like this:

cn:: IGJlZ2lucyB3aXRoIGEgc3BhY2U=

Multiple entries within the same LDIF file are separated by a blank line. Multiple blank lines are considered a logical end-of-file.
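The single-colon versus double-colon rule above can be illustrated with a short Python sketch. The function names needs_base64 and format_attr are hypothetical and are not part of the IBM utilities. Encoding the text as UTF-8 before base-64 encoding is an assumption that matches LDAP V3.

```python
import base64

def needs_base64(value: str) -> bool:
    """Return True if the value must be written after a double colon."""
    if value == "":
        return False
    # A leading space or colon forces base-64 encoding.
    if value[0] in (" ", ":"):
        return True
    # Any character outside US-ASCII also forces base-64 encoding.
    return any(ord(ch) > 127 for ch in value)

def format_attr(attrtype: str, value: str) -> str:
    """Format one attribute line, choosing ':' or '::' as required."""
    if needs_base64(value):
        # Assumption: text is encoded as UTF-8 (LDAP V3) before base-64.
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        return f"{attrtype}:: {encoded}"
    return f"{attrtype}: {value}"

print(format_attr("cn", "John E Doe"))
print(format_attr("cn", " begins with a space"))
```

The second call reproduces the encoded line shown in the example above.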
LDIF example
Here is an example of an LDIF file containing three entries.
dn: cn=John E Doe, o=University of High
 er Learning, c=US
cn: John E Doe
cn: John Doe
objectclass: person
sn: Doe

dn: cn=Bjorn L Doe, o=University of High
 er Learning, c=US
cn: Bjorn L Doe
cn: Bjorn Doe
objectclass: person
sn: Doe

dn: cn=Jennifer K. Doe, o=University of High
 er Learning, c=US
cn: Jennifer K. Doe
cn: Jennifer Doe
objectclass: person
sn: Doe
jpegPhoto:: /9j/4AAQSkZJRgABAAAAAQABAAD/2wBDABALD
 A4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQ
 ERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2P/2wBDARESEhgVG
...

The jpegPhoto in Jennifer Doe's entry is encoded using base-64. The textual attribute values can also be specified in base-64 format. However, if this is the case, the base-64 encoding must be in the code page of the wire format for the protocol (that is, for LDAP V2, the IA5 character set and for LDAP V3, the UTF-8 encoding).
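The entry layout described above (continuation lines, repeated attributes, blank-line separators, and base-64 values) can be sketched as a small Python parser. This is a minimal illustration and not the parser used by the IBM utilities. The names unfold and parse_ldif are hypothetical, the second entry in the sample is invented for the illustration, and the sketch does not handle the version and charset header lines or file URLs.

```python
import base64

def unfold(text: str) -> list[str]:
    """Join continuation lines (leading space or tab) onto the previous line."""
    lines: list[str] = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines and lines[-1] != "":
            lines[-1] += raw[1:]  # drop the single continuation character
        else:
            lines.append(raw)
    return lines

def parse_ldif(text: str) -> list[dict[str, list[bytes]]]:
    """Parse LDIF text into a list of {attrtype: [values]} dictionaries."""
    entries: list[dict[str, list[bytes]]] = []
    current: dict[str, list[bytes]] = {}
    blank_run = 0
    for line in unfold(text):
        if line.strip() == "":
            blank_run += 1
            if blank_run >= 2:
                break  # multiple blank lines: logical end-of-file
            if current:
                entries.append(current)
                current = {}
            continue
        blank_run = 0
        if line.startswith("#"):
            continue  # comment line
        attrtype, _, rest = line.partition(":")
        if rest.startswith(":"):
            value = base64.b64decode(rest[1:].strip())  # double colon
        else:
            value = rest.lstrip(" ").encode("utf-8")    # single colon
        current.setdefault(attrtype.strip().lower(), []).append(value)
    if current:
        entries.append(current)
    return entries

# The second entry is hypothetical and only exercises the '::' branch.
sample = """\
dn: cn=John E Doe, o=University of High
 er Learning, c=US
cn: John E Doe
cn: John Doe
objectclass: person
sn: Doe

dn: cn=Example, o=sample
cn:: IGJlZ2lucyB3aXRoIGEgc3BhY2U=
"""

entries = parse_ldif(sample)
print(entries[0]["dn"])
print(entries[0]["cn"])
print(entries[1]["cn"])
```

Values are kept as bytes because an attribute such as jpegPhoto can hold binary data.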
Version 1 LDIF support
The client utilities (idsldapmodify and idsldapadd) have been enhanced to recognize the latest version of LDIF, which is identified by the presence of the "version: 1" tag at the head of the file. Unlike the original version of LDIF, the newer version of LDIF supports attribute values represented in UTF-8 (instead of the very limited US-ASCII).
However, manual creation of an LDIF file containing UTF-8 values may be difficult. To simplify this process, a charset extension to the LDIF format is supported. This extension allows an IANA character set name to be specified in the header of the LDIF file (along with the version number). Only a limited set of the IANA character sets is supported. See IANA character sets supported by platform for the specific charset values that are supported for each operating system platform.
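As an illustration of the header described above, the following Python sketch reads the version and charset tags from the head of an LDIF file. The function name read_header is hypothetical, and returning None for a missing tag is a convention of this sketch rather than documented behavior of the utilities.

```python
from typing import Optional, Tuple

def read_header(lines: list[str]) -> Tuple[Optional[int], Optional[str]]:
    """Return (version, charset) from the head of an LDIF file.

    Either element is None when the corresponding tag is absent
    (a convention of this sketch).
    """
    version: Optional[int] = None
    charset: Optional[str] = None
    for line in lines:
        stripped = line.strip()
        if stripped == "" or stripped.startswith("#"):
            continue  # skip blank lines and comments before the header
        name, _, value = stripped.partition(":")
        name = name.strip().lower()
        if name == "version":
            version = int(value.strip())
        elif name == "charset":
            charset = value.strip()
        else:
            break  # first non-header line: the entries start here
    return version, charset

print(read_header(["version: 1", "charset: ISO-8859-1", "dn: o=sample"]))
print(read_header(["# comment only", "dn: o=sample"]))
```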
The version 1 LDIF format also supports file URLs. This provides a more flexible way to define a file specification. File URLs take the following form:
attribute:< file:///path (where path syntax depends on platform)

For example, the following are valid file URLs:

jpegphoto:< file:///d:\temp\photos\myphoto.jpg (DOS/Windows style paths)
jpegphoto:< file:///etc/temp/photos/myphoto.jpg (UNIX or Linux style paths)

Note: The IBM Directory utilities support both the new file URL specification and the older style (for example, "jpegphoto: /etc/temp/myphoto"), regardless of the version specification. In other words, the new file URL format can be used without adding the version tag to your LDIF files.
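The file URL form can be recognized with a few lines of Python. This is a sketch under stated assumptions: the function name split_file_url is hypothetical, and dropping the leading slash before a drive letter is a guess at how a DOS/Windows path would be recovered, not documented behavior. The older style without a URL is not handled here.

```python
import re
from typing import Optional, Tuple

def split_file_url(line: str) -> Optional[Tuple[str, str]]:
    """Split 'attr:< file:///path' into (attr, path); None for other lines."""
    attr, sep, rest = line.partition(":")
    if not sep or not rest.startswith("<"):
        return None  # not the ':<' form
    url = rest[1:].strip()
    prefix = "file://"
    if not url.startswith(prefix):
        return None
    path = url[len(prefix):]  # "/etc/..." or "/d:\temp\..."
    # Assumption: the slash before a drive letter is not part of a DOS path.
    if re.match(r"^/[A-Za-z]:", path):
        path = path[1:]
    return attr.strip(), path

print(split_file_url(r"jpegphoto:< file:///d:\temp\photos\myphoto.jpg"))
print(split_file_url("jpegphoto:< file:///etc/temp/photos/myphoto.jpg"))
```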
Version 1 LDIF examples
We can use the optional charset tag so that the utilities will automatically convert from the specified character set to UTF-8 as in the following example:
version: 1
charset: ISO-8859-1

dn: cn=Juan Griego, o=University of New Mexico, c=US
cn: Juan Griego
sn: Griego
description:: V2hhdCBhIGNhcmVmdWwgcmVhZGVyIHlvd
title: Associate Dean
title: [title in Spanish]
jpegPhoto:< file:///usr/local/photos/jgriego.jpg

In this instance, all values following an attribute name and a single colon are translated from the ISO-8859-1 character set to UTF-8. Values following an attribute name and a double colon (such as description:: V2hhdCBhIGNhcm... ) must be base-64 encoded, and are expected to be either binary or UTF-8 character strings. Values read from a file, such as the jpegPhoto attribute specified by the file URL in the previous example, are also expected to be either binary or UTF-8. No translation from the specified "charset" to UTF-8 is done on those values.
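The translation rule above can be sketched in Python as follows. The function name value_to_utf8 and the sample surname are hypothetical. The sketch relies on Python's own codec names, which may not cover every charset string in the platform table, and it does not handle values read from a file.

```python
import base64

def value_to_utf8(line: bytes, charset: str) -> tuple[str, bytes]:
    """Return (attrtype, value) with the value as UTF-8 or binary bytes."""
    name, _, rest = line.partition(b":")
    attrtype = name.decode("ascii").strip()
    if rest.startswith(b":"):
        # Double colon: base-64 payload, taken as binary or UTF-8 as-is.
        return attrtype, base64.b64decode(rest[1:].strip())
    # Single colon: translate from the declared charset to UTF-8.
    text = rest.lstrip(b" ").decode(charset)
    return attrtype, text.encode("utf-8")

# Hypothetical value containing the ISO-8859-1 byte 0xF1 (n with tilde).
print(value_to_utf8(b"sn: Mu\xf1oz", "ISO-8859-1"))
print(value_to_utf8(b"cn:: IGJlZ2lucyB3aXRoIGEgc3BhY2U=", "ISO-8859-1"))
```

Only the single-colon value is transcoded. The base-64 value is decoded and passed through unchanged, which matches the rule that no charset translation is applied to it.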
In this example of an LDIF file without the charset tag, content is expected to be in UTF-8, or base-64 encoded UTF-8, or base-64 encoded binary data:
# IBM Directory sample LDIF file
#
# The suffix "o=sample" should be defined before attempting to load
# this data.
version: 1

dn: o=sample
objectclass: top
objectclass: organization
o: IBM

dn: ou=Austin, o=sample
ou: Austin
objectclass: organizationalUnit
seealso: cn=Linda Carlesberg, ou=Austin, o=sample

This same file could be used without the version: 1 header information, as in previous releases of the IBM Directory:
# IBM Directory sample LDIF file
#
# The suffix "o=sample" should be defined before attempting to load
# this data.

dn: o=sample
objectclass: top
objectclass: organization
o: IBM

dn: ou=Austin, o=sample
ou: Austin
objectclass: organizationalUnit
seealso: cn=Linda Carlesberg, ou=Austin, o=sample

Note: The textual attribute values can be specified in base-64 format.
IANA character sets supported by platform
The following table defines the set of IANA-defined character sets that can be defined for the charset tag in a Version 1 LDIF file, on a per-platform basis. The value in the left-most column defines the text string that can be assigned to the charset tag. An "X" indicates that conversion from the specified charset to UTF-8 is supported for the associated platform, and that all string content in the LDIF file is assumed to be represented in the specified charset. "n/a" indicates that the conversion is not supported for the associated platform.
String content is defined to be all attribute values that follow an attribute name and a single colon.
See IANA Character Sets for more information about IANA-registered character sets. Go to:
http://www.iana.org/assignments/character-sets
Table 31. IANA-defined character sets

Character    Locale                                          DB2® Code Page
Set Name     HP-UX   Linux®, Linux_390  NT    AIX®  Solaris  UNIX®  NT
ISO-8859-1   X       X                  X     X     X        819    1252
ISO-8859-2   X       X                  X     X     X        912    1250
ISO-8859-5   X       X                  X     X     X        915    1251
ISO-8859-6   X       X                  X     X     X        1089   1256
ISO-8859-7   X       X                  X     X     X        813    1253
ISO-8859-8   X       X                  X     X     X        916    1255
ISO-8859-9   X       X                  X     X     X        920    1254
ISO-8859-15  X       n/a                X     X     X
IBM437       n/a     n/a                X     n/a   n/a      437    437
IBM850       n/a     n/a                X     X     n/a      850    850
IBM852       n/a     n/a                X     n/a   n/a      852    852
IBM857       n/a     n/a                X     n/a   n/a      857    857
IBM862       n/a     n/a                X     n/a   n/a      862    862
IBM864       n/a     n/a                X     n/a   n/a      864    864
IBM866       n/a     n/a                X     n/a   n/a      866    866
IBM869       n/a     n/a                X     n/a   n/a      869    869
IBM1250      n/a     n/a                X     n/a   n/a
IBM1251      n/a     n/a                X     n/a   n/a
IBM1253      n/a     n/a                X     n/a   n/a
IBM1254      n/a     n/a                X     n/a   n/a
IBM1255      n/a     n/a                X     n/a   n/a
IBM1256      n/a     n/a                X     n/a   n/a
TIS-620      n/a     n/a                X     X     n/a      874    874
EUC-JP       X       X                  n/a   X     X        954    n/a
EUC-KR       n/a     n/a                n/a   X     X*       970    n/a
EUC-CN       n/a     n/a                n/a   X     X        1383   n/a
EUC-TW       X       n/a                n/a   X     X        964    n/a
Shift-JIS    n/a     X                  X     X     X        932    943
KSC          n/a     n/a                X     n/a   n/a      n/a    949
GBK          n/a     n/a                X     X     n/a      1386   1386
Big5         X       n/a                X     X     X        950    950
GB18030      n/a     X                  X     X     X
HP15CN       X (with non-GB18030)

Notes:
- The new Chinese character set standard (GB18030) is supported with appropriate patches available from www.sun.com and www.microsoft.com.
- On the Windows® 2000 operating system, set the environment variable zhCNGB18030=TRUE.