So I am working on a project, and I do a little investigation down a particular path, start writing some code, and we decide not to go that route.
Throwaway code, fine. But it is actually something I made small improvements on, based on someone else's shared code. So I figured I would contribute it rather than totally throw it away.
PHP UTF-16 to UTF-8 conversion based on source from http://www.moddular.org/log/utf16-to-utf8 (which was based on some JavaScript code... also potentially useful.):
Let's say you have something in UTF-16. PHP is not very UTF-16 friendly yet (STILL).
Based on the code from moddular, I decided being able to detect a BOM would be good, so I pulled that into a separate function. I just use 'be' and 'le' to indicate big and little endian (and '' to indicate neither), but one could define constants instead.
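As a rough sketch of the "define constants instead" idea, something like the following would work. The constant names and the detectUtf16Bom() function name are made up for illustration; they are not part of the moddular code or of my functions.

```php
<?php
// Hypothetical constant names; the values match the 'be' / 'le' / '' strings described above.
const UTF16_BOM_NONE = '';
const UTF16_BOM_BE = 'be';
const UTF16_BOM_LE = 'le';

// Hypothetical variant of the BOM check that compares the first two bytes directly.
function detectUtf16Bom($str)
{
    $prefix = substr($str, 0, 2);
    if ($prefix === "\xFE\xFF") {
        return UTF16_BOM_BE; // FE FF marks big endian
    }
    if ($prefix === "\xFF\xFE") {
        return UTF16_BOM_LE; // FF FE marks little endian
    }
    return UTF16_BOM_NONE;
}
```

Since the constants keep the same string values, code that already compares against 'be' and 'le' would keep working.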
Also, my little bit of research indicated that BOM is optional in UTF-16, and that big endian is generally the safer of the two assumptions. I extended the original method to be configurable, but retain the default assumption that no BOM means the string is not UTF-16. If you force conversion, the default endian is big, but can be overridden to little ('le').
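To see why the endian assumption matters when there is no BOM, here is a small illustration (not part of the conversion code) that reads the same two bytes both ways using PHP's built-in unpack():

```php
<?php
$bytes = "\x00\x41"; // two bytes with no BOM in front of them

$asBigEndian = unpack('n', $bytes)[1];    // 'n' = unsigned 16-bit, big endian
$asLittleEndian = unpack('v', $bytes)[1]; // 'v' = unsigned 16-bit, little endian

printf("big endian:    U+%04X\n", $asBigEndian);    // U+0041, the letter A
printf("little endian: U+%04X\n", $asLittleEndian); // U+4100, a different character entirely
```

Guess wrong and every character in the file comes out as something else, which is why the conversion stays off by default unless a BOM is found or you force it.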
Have fun.
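<?php
// Returns 'be' or 'le' for a big or little endian BOM, '' when there is none.
function getUtf16Bom($str)
{
    if (strlen($str) < 2) {
        return ''; // too short to contain a BOM
    }
    $c0 = ord($str[0]);
    $c1 = ord($str[1]);
    if ($c0 == 0xFE && $c1 == 0xFF) {
        return 'be';
    }
    if ($c0 == 0xFF && $c1 == 0xFE) {
        return 'le';
    }
    return '';
}

// Reads the 16-bit code unit that starts at byte offset $i.
function readUtf16Unit($str, $i, $bigEndian)
{
    return $bigEndian
        ? ord($str[$i]) << 8 | ord($str[$i + 1])
        : ord($str[$i + 1]) << 8 | ord($str[$i]);
}

function utf16ToUtf8($str, $forceConversion = false, $assumedEndian = 'be')
{
    $type = getUtf16Bom($str);
    if ('' == $type && !$forceConversion) {
        return $str; // no BOM and not forced: assume the string is not UTF-16
    }
    if ('' != $type) {
        $str = substr($str, 2); // strip the BOM, but only when there is one
    }
    // A detected BOM always wins; $assumedEndian only applies without one.
    $bigEndian = ('' != $type) ? ('be' == $type) : ('be' == $assumedEndian);
    $len = strlen($str) - (strlen($str) % 2); // ignore a dangling odd byte
    $dec = '';
    for ($i = 0; $i < $len; $i += 2) {
        $c = readUtf16Unit($str, $i, $bigEndian);
        // Combine a surrogate pair into one code point above U+FFFF.
        if ($c >= 0xD800 && $c <= 0xDBFF && $i + 3 < $len) {
            $d = readUtf16Unit($str, $i + 2, $bigEndian);
            if ($d >= 0xDC00 && $d <= 0xDFFF) {
                $c = 0x10000 + (($c - 0xD800) << 10) + ($d - 0xDC00);
                $i += 2; // the low surrogate has been consumed
            }
        }
        if ($c <= 0x007F) {
            $dec .= chr($c);
        } else if ($c <= 0x07FF) {
            $dec .= chr(0xC0 | (($c >> 6) & 0x1F));
            $dec .= chr(0x80 | ($c & 0x3F));
        } else if ($c <= 0xFFFF) {
            $dec .= chr(0xE0 | (($c >> 12) & 0x0F));
            $dec .= chr(0x80 | (($c >> 6) & 0x3F));
            $dec .= chr(0x80 | ($c & 0x3F));
        } else {
            $dec .= chr(0xF0 | (($c >> 18) & 0x07));
            $dec .= chr(0x80 | (($c >> 12) & 0x3F));
            $dec .= chr(0x80 | (($c >> 6) & 0x3F));
            $dec .= chr(0x80 | ($c & 0x3F));
        }
    }
    return $dec;
}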
// Let's say we want to open a UTF-16 spreadsheet exported from Excel...
$dataFile = utf16ToUtf8(file_get_contents('config/sfxdata.txt'));
// split it into lines...
// (explode() replaces split(), which was deprecated in PHP 5.3 and removed in PHP 7)
$dataList = explode("\n", $dataFile);
$dataTable = array();
foreach ($dataList as $lineNumber => $dataLine) {
    // split each line into cells...
    $dataTable[] = explode("\t", $dataLine);
}
print_r($dataTable);

The technology this morning was uncooperative. Unfortunately my demo hinged on a locally hosted web server, and I couldn't get the laptop to output video to the projector.
So now the demo is uploaded to my old Engineering space (because they support PHP) and here is the content of the slides:
Slides
Transcript:
My name is Troy Hurteau, I am the Interface and Applications Development Specialist for NCSU Libraries.
This presentation was a small part of a larger panel on the current landscape and future movement of accessibility on the web.
So, when we talk about sharing information on the web there are several things to consider from the personal perspective (thinking non-technically).
The visitors accessing your site or application come from a variety of geographical and cultural environments. They also have a diverse set of capabilities: language, vision, education, and computing platform, to name a few.
Accessibility impacts how these visitors will be able to interact with or through the web. Depending on how your site or application is built, they may have challenges with authoring contributed content, reading or finding content, or understanding the information you are trying to convey. These can all affect the quality of the experience they encounter.
Rich interfaces can be impressive, and very satisfying when well designed. So long as they don't create unnecessary barriers to the information, using them is a good thing, even if they benefit some users more than others.
One consideration is making the act of contributing, collaborating, and generally interacting with other people online accessible. A fairly common example of this is the rich-text editors used in web mail, internet forums, blogs, and content management systems.
If the buttons, images, and form elements that make up the text-entry interface are all marked up properly, and keyboard navigation of the interface is possible, then there usually won't be a problem. Even just providing a "source view" can make the task workable for most users.
When the needs involve complex markup, the outcomes may not be comparable. The goal is to make the experiences as positive as possible. Often even a reasonable effort will rise far above what most sites attempt.
It is also important that the information authored in these systems conform to best practices for markup. Proper use of headers, paragraphs, lists, and other tags helps users with assistive technology navigate within the page.
Most developers know about things like alt text for images, but there are many markup standards that can be applied.
The link in this slide (http://people.engr.ncsu.edu/jthurtea/access08/) illustrates two very similar pages. The first example uses tags correctly; the second uses the wrong tags in almost every aspect. It is styled to look identical, but the second page is significantly harder to use with a screen reader. It also would not perform as well with search engines.
Aside from the application of tags, there are other authoring practices that can make a huge difference, such as descriptive link text with a title attribute as a backup. Generally speaking, links like "More" and "Click Here" are bad form. When this type of link label is unavoidable, a title attribute on the link is an acceptable substitute.
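As a minimal sketch of that practice (it was not part of the talk), the helper below builds a link and adds a title attribute only when one is supplied. The renderLink() name and the example URLs are hypothetical.

```php
<?php
// Hypothetical helper: builds an anchor tag, with an optional title attribute
// to back up a terse label such as "More".
function renderLink($href, $label, $title = '')
{
    $html = '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '"';
    if ($title !== '') {
        $html .= ' title="' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '"';
    }
    return $html . '>' . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . '</a>';
}

// Preferred: the label itself says where the link goes.
echo renderLink('/reports/2008', 'Read the 2008 annual report'), "\n";
// Fallback: a terse label backed up by a descriptive title.
echo renderLink('/reports/2008', 'More', 'More about the 2008 annual report'), "\n";
```

The first call is the one to aim for; the second is the fallback for cases where the short label cannot be changed.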
From another angle, some of the more recent developments in web technology have enabled whole new ways to share information that go beyond basic text displays and inaccessible images that require alternative text.
Sharing data through the web is a huge area of opportunity.
The thing to keep in mind is that while rich web interfaces may be cool and effective for a portion of the web population, there are users that will have problems using such approaches for a number of technical, situational, and personal reasons.
If harnessed poorly, this new technology is little different than the <blink> tag, or table-based layouts. Misapplied technology causes more issues than it solves.
These three examples explore the same set of data through different methodologies.
The first (http://people.engr.ncsu.edu/jthurtea/access08/3.php) is just a table rendering of the data. It is properly marked up, though since it is four-dimensional data there are opportunities to apply even more useful header hierarchies.
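For readers who have not marked up a data table this way, here is a rough sketch of the general idea: a caption plus column and row headers with scope attributes. It is not the code behind the demo page, and the function names and numbers are invented for illustration.

```php
<?php
// Hypothetical escaping helper used by the sketch below.
function esc($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical renderer: $rows maps a row header to its list of cell values.
function renderDataTable($caption, array $columns, array $rows)
{
    $html = "<table>\n<caption>" . esc($caption) . "</caption>\n<thead>\n<tr>";
    foreach ($columns as $column) {
        $html .= '<th scope="col">' . esc($column) . '</th>';
    }
    $html .= "</tr>\n</thead>\n<tbody>\n";
    foreach ($rows as $rowHeader => $cells) {
        $html .= '<tr><th scope="row">' . esc($rowHeader) . '</th>';
        foreach ($cells as $cell) {
            $html .= '<td>' . esc($cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</tbody>\n</table>\n";
}

// Made-up numbers, purely for illustration.
echo renderDataTable(
    'Visits by branch and year',
    array('Branch', '2007', '2008'),
    array('Main' => array(10, 12), 'East' => array(7, 9))
);
```

With scope set on the headers, a screen reader can announce which row and column each cell belongs to as the user moves through the table.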
The visual approach (http://people.engr.ncsu.edu/jthurtea/access08/4.php) pulls the data into a Google visualization application, in Flash. This is an attractive way to display the data, but if it is the only way to access the data there is a missed opportunity, and not just for users with disabilities.
Any methods that give more access to the data improve the application of the information to a variety of uses. The third example (http://people.engr.ncsu.edu/jthurtea/access08/5.php) shows how a simple script can be used to customize the data view. This isn't even using AJAX, though it just as easily could.
This is the latest update in my ever-shifting mesh of projects:
Work Projects (development time) -
Journey is my top priority at work for October. Journey Version 1 ships at the end of the month and should handle all data on people and positions at that point. Ad-hoc/Committee group management is high on the wish list for Version 2. Version 1 includes a full migration of LDAP data from the Netscape LDAP to the AD.
Still working on Telecom Request forms. Goal is to have a demoable version ready the week of October 6th. A fully functional version won't be ready until Journey Version 1 is finished, so tentatively Version 1 would be in November.
Planning Technical Services/Remedy Phase 2 now. I think "Phase 2" will be "Version 1" for that project. Does not currently hinge on Journey. No timetable has been set yet.
I have a few Jiras for Reserves Direct bug fixes that are supposed to be "easy".
All other projects are in "support time mode".
Journey Version 0.5 and Spam protection for Forms have both been folded into a personal project called Quepie.
Personal Projects -
Still trying to get enough time to push the TwitterRdf for Sysnews into a version 0.9. Wanted to have it done Monday. Maybe by end of week? Unfortunately personal development time continues to be short. This would be a stable/clean version that runs manually; 1.0 would be the cronable version.
Want to get back to SAF/E and Flora, but so far these remain too pie-in-the-sky.
Quepie is a newer project to offload some work development time. It is a set of Zend Framework extensions including everything from my LDAP adapter (can search and write to LDAP, unlike the built-in Zend_Ldap), to utility objects, to an MVC alternative framework.
