<?xml version="1.0" encoding="utf-8"?>
<feed xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom">
  <title>Art of Coding</title>
  <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/" />
  <link rel="self" href="http://www.artofcoding.net/blog/SyndicationService.asmx/GetAtom" />
  <icon>favicon.ico</icon>
  <updated>2008-04-19T20:34:03.9415-04:00</updated>
  <author>
    <name>Jeff Scanlon</name>
  </author>
  <subtitle />
  <id>http://www.artofcoding.net/blog/</id>
  <generator uri="http://www.dasblog.net" version="1.9.6264.0">DasBlog</generator>
  <entry>
    <title>Silverlight - Basics - XAML</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2008/04/20/SilverlightBasicsXAML.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,69d63b7a-1f17-4f97-946f-c253d3aa8d2a.aspx</id>
    <published>2008-04-19T20:34:03.9415-04:00</published>
    <updated>2008-04-19T20:34:03.9415-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Silverlight 2.0 is on its way to an official release, and with beta 1, we are now
seeing a fuller implementation, including a number of standard controls and richer
support to accomplish common tasks. If you're new to Silverlight / Windows Presentation
Foundation, the place to start is understanding XAML (pronounced zammel) - the Extensible
Application Markup Language. XAML is an XML dialect, and thus, follows the traditional
tree hierarchy of elements you're used to seeing in XML. The main features of XAML
are:
</p>
        <ul>
          <li>
Element names correspond to objects. A "&lt;UserControl ...&gt;" in XAML corresponds
to the System.Windows.Controls.UserControl class, for example. 
</li>
          <li>
Type converters understand string property values. Type converters are used to convert
a property value, such as Background="White" or Background="#FF0000" to the class
behind the property. A type converter parses and understands what you're asking for
(as long as you specify something it can convert). 
</li>
          <li>
Markup extensions. A markup extension is a special way of saying "interpret this property
value, don't take it literally or type convert it." Markup extensions are used for
referencing resources, creating control templates and data binding. 
</li>
          <li>
Dependency properties and attached properties. A dependency property is a property
that depends on potentially many things - including animation, data binding and the
value you explicitly set. An attached property is a special type of dependency property
that an object doesn't define. It is "attached" in that the object with the attached
property has the property, but it is meaningless to the object itself. The attached
property IS meaningful, however, to elements enclosing the object. A great example
is a container such as Canvas (it provides absolute positioning) and the attached
property Canvas.Left and Canvas.Top - specify these on a child object, such as an
Image, and the Canvas knows where to place the Image. 
</li>
          <li>
Direct connection to code-behind. When built, the XAML file causes generation of a
piece of the class specified by the x:Class attribute, and it is here that object
identifiers are created to connect to the objects specified by elements in the XAML
(as long as the element in XAML has an x:Name attribute defined). This connection also
includes connecting to event handlers in the code-behind.</li>
        </ul>
        <p>
This is a rather quick overview of XAML - over time we will delve deeper into these
features and all the aspects of Silverlight.
</p>
      </div>
    </content>
  </entry>
  <entry>
    <title>Friday Cat Blogging</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2008/01/05/FridayCatBlogging.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,a7aaf94f-19ed-4778-8a21-1e9ad1942b78.aspx</id>
    <published>2008-01-04T20:29:48.83-05:00</published>
    <updated>2008-01-04T20:29:48.83-05:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Blogging more often isn't a New Year's Resolution. I wish I blogged more but life
has many distractions, much technology to pay attention to and much to focus on to
support my clients. I will blog more this year though. I have ideas I want to prototype
and release to the world on this site, and one project I know people are waiting on
(they probably think it'll never get done but it's coming once I bang out a few bugs
:( )
</p>
        <p>
While this is and will be a professional blog, I can't resist veering into the off-topic
to join in the Friday animal blogging that is customary on many blogs (hey, <a href="http://www.schneier.com/blog/archives/2008/01/lolcat_with_squ.html" target="_blank">Bruce
Schneier</a> can do it, so can I! and purely by coincidence, his is an lolcat picture
also)
</p>
        <p>
I came across this particular picture after a crazy Google search and a friend supplied
a great caption lolcats style.
</p>
        <p>
          <a href="http://www.artofcoding.net/blog/content/binary/WindowsLiveWriter/FridayCatBlogging_1175B/cat20080104_6.jpg">
            <img style="border-right: 0px; border-top: 0px; border-left: 0px; border-bottom: 0px" height="387" alt="cat20080104" src="http://www.artofcoding.net/blog/content/binary/WindowsLiveWriter/FridayCatBlogging_1175B/cat20080104_thumb_2.jpg" width="401" border="0" />
          </a>
        </p>
        <p>
Happy New Year, everyone.
</p>
      </div>
    </content>
  </entry>
  <entry>
    <title>ASP.NET AJAX and Content Management Systems</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/10/16/ASPNETAJAXAndContentManagementSystems.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,c5a95f77-4bb5-478c-b88f-7b860fceb608.aspx</id>
    <published>2007-10-15T23:52:36.424502-04:00</published>
    <updated>2007-10-15T23:52:36.424502-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
So, last week at work I tackled what is probably the infamous content management issue
with ASP.NET AJAX. I found a good thread on this online, but I wanted to
write about my investigation and solutions (gathered from a few sites and also from
tracing through our content management system) for the benefit of... someone,
I hope. :)
</p>
        <p>
Stating the problem is easy: AJAX doesn't work with our home grown content management
system. The CMS predates me by a few years, so this was my first time digging into
its guts.
</p>
        <p>
Before digging into the meat of the problem, let me share my troubleshooting configuration.
It's important to have the right tools and the right approach to simplify your life,
and in this case I used <a href="http://www.fiddlertool.com/fiddler/">Fiddler</a> as
my main tool to trace HTTP requests. Next, I created a simple ASP.NET AJAX application
- an UpdatePanel that displays the current time (calculated server side) and has a
button to perform the asynchronous post back. Executing this while Fiddler is running
gave me a clean example of a request/response on page load and on an asynchronous
post back. In some ways this is like a "control" in a science experiment, but the
comparison only goes so far (we're engineering here, not science-ing, even if we share
some thought processes). Having an HTTP tracer and a clean point of comparison
made it a lot easier to track down the AJAX problems in our content management system
(CMS).
</p>
        <p>
The CMS is an IIS web application, and can serve both static content and dynamic content
(ASPX pages) from other websites. It uses a template system to put content together,
but the details of this aren't important to explain the issue and solution.
</p>
        <p>
Let's say the main site (the content management system) is under /CMS and a sample
content site is at /ContentSite.
</p>
        <p>
The CMS application has an ASPX page named "serveContent" that is passed an identifier
which it then resolves to actual content. The requested page can contain other pages,
so CMS aggregates them and sends them to the client. This is, more or less, the expected
configuration of a content management system.
</p>
        <p>
The issue with AJAX happens when an ASPX page (with AJAX content) is aggregated
and sent to the client. The requesting page, from the client, might look like <a href="http://localhost/CMS/serveContent.aspx?id=123">http://localhost/CMS/serveContent.aspx?id=123</a> but
ASPX pages will have the "action" property of their "form" element set to something
like <a href="http://localhost/ContentSite/someAspxPage.aspx">http://localhost/ContentSite/someAspxPage.aspx</a>.
Even though I'm an experienced developer and can derive solutions on my own, it's
still quite wise to research online. Software can be complex and a solution that sounds
good requires a sanity check to make sure you aren't making a faulty assumption or
not taking something important into account, not to mention, finding solutions fast
before writing new code and debugging it. (BTW, if someone reads this and realizes
I'm doing something stupid here, please let me know!)
</p>
        <p>
At this point, I figured the solution was to modify the form's action on the response's
way out, and this page concurred: <a href="http://weblogs.asp.net/jezell/archive/2004/03/15/90045.aspx">Fixing
Microsoft's Bugs: Url Rewriting</a>. The most important part of his code is the WriteAttribute
method of FormFixerHtmlTextWriter. After applying this code, and tweaking a number
of things specific to our content management system, AJAX finally worked.
</p>
        <p>
But only on the first asynchronous post back.
</p>
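        <p>
To make "modify the form's action on the response's way out" concrete, here is a minimal
sketch in Python rather than ASP.NET. It is not the FormFixerHtmlTextWriter code from
the post linked above: it swaps in a plain regular expression over the already rendered
HTML, the helper name rewrite_form_action is hypothetical, and it assumes the action
attribute is double-quoted.
</p>
        <pre><![CDATA[
```python
import html
import re

# Captures the opening form tag up to and including action=" so that only the
# old attribute value is replaced.
# Assumption: the action attribute is double-quoted in the rendered page.
FORM_ACTION = re.compile(r'(<form\b[^>]*?\baction=")[^"]*', re.IGNORECASE)


def rewrite_form_action(page, new_action):
    """Point the first form's action at new_action (hypothetical helper)."""
    # Escape the new URL so characters like & stay valid inside the attribute.
    safe = html.escape(new_action, quote=True)
    # A function replacement avoids backslash handling in the new value.
    return FORM_ACTION.sub(lambda m: m.group(1) + safe, page, count=1)
```
]]></pre>
        <p>
Calling rewrite_form_action on a page whose form posts to /ContentSite/someAspxPage.aspx,
with /CMS/serveContent.aspx?id=123 as the new action, leaves the rest of the markup alone
and changes only where the form posts back to.
</p>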
        <p>
The next problem was with how UpdatePanel works. I am unsure if this is a bug with
ASP.NET AJAX or there are design reasons but when an asynchronous post back occurs
with UpdatePanel, the "action" property on the "form" is forced back to its original
form. Now I need to change the action again! This fix was accomplished by changing
the response stream. I found the solution over at this page <a href="http://forums.asp.net/p/1037846/1809800.aspx">http://forums.asp.net/p/1037846/1809800.aspx</a>,
10 posts down. I'm unsure if this is the best solution, but I prefer it to a client
side solution (changing the form action using JavaScript) because it's not dependent
on the client to ensure the pages work. I didn't explore it too deeply, but apparently
the following code on the client will solve this problem. Note that it's important to set
both the _initialAction and the action properties.
</p>
        <blockquote>
          <p>
            <font face="Courier New" size="2">Sys.Application.add_load(function() {</font>
          </p>
          <p>
            <font face="Courier New" size="2">   var form = Sys.WebForms.PageRequestManager.getInstance()._form; </font>
          </p>
          <p>
            <font face="Courier New" size="2">   form._initialAction = form.action =
window.location.href; </font>
          </p>
          <p>
            <font face="Courier New" size="2">});</font>
          </p>
        </blockquote>
        <p>
I found this snippet of code in the <a href="http://forums.asp.net/p/1083499/1609805.aspx#1609805">ASP.NET
forums</a>. Scroll to the bottom, to Jeffrey Zhao's post.
</p>
        <p>
I wanted as little extra code as possible to fix this issue and I'm fairly happy with
where things are. It was nice to see HTTP 200's in Fiddler with each asynchronous
post back, after taking a fair chunk of time to track this down.
</p>
        <p>
Thanks to Jesse Ezell, fitsner and Jeffrey Zhao for posting useful code online.
</p>
      </div>
    </content>
  </entry>
  <entry>
    <title>Getting excited about RIA</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/10/GettingExcitedAboutRIA.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,a970cb75-44de-4f22-951f-9a7fcf4a9106.aspx</id>
    <published>2007-08-10T02:51:11.5731224-04:00</published>
    <updated>2007-08-10T02:51:11.5731224-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Why should we be excited about RIA? What's the big deal? Some people have reductionist
attitudes toward new technology, focusing more on how something is the "same
ol' stuff" instead of focusing on the new aspects the technology brings to the table.
Strides forward in software technology are usually evolutionary instead of revolutionary
which makes it easy to succumb to this reductionist perspective. Software borrows
from the successes of the past and improves upon the failures. The revolution, if
it happens, is in how the technology is used. Arguably the largest world changing
technology in our lifetimes is the web, and this is just an evolution of the Internet
via a new application layer protocol (HTTP), a markup language (which evolved from
SGML), and rendering engines to display it. Some dismissed it as "nothing really new"
or thought it had nowhere to go. Why should we have gotten excited
about the web, if it was just (apparently) a few minor steps ahead of where the technology
was at the time? Others embraced it, and now look at where it is. The web is such
a great example because we can see where things are now. We know that HTTP, HTML
and a few web browsers enabled the world to change.
</p>
        <p>
New software, such as what we're seeing in the RIA world, is all about enabling something
new / making something easier. It doesn't matter if software is truly innovative
(which, if we define too narrowly, is significantly harder to achieve than most people
realize) or not. The focus should be on what we can do <em>now</em> that
we couldn't do <em>yesterday</em>, or what we can do <em>now</em> that was hard
to do <em>yesterday</em>. This perspective is far from original, but it's useful to
bring to the front of our minds as we consider where RIA can go.
</p>
        <p>
So, what really are we gaining? As I mentioned in my previous post, I see RIA as a
shift in perspective. Website designers being able to develop desktop applications,
and .NET developers being able to use their skill set for desktop and web user interfaces,
both without massive retraining, enables people to contribute their expertise in new
ways to the applications that are developed. We're bridging worlds that co-existed
but weren't as blended together as they are now. The technology is enabling this blending.
JavaScript and HTML and CSS are leaving the browser. XAML is spoken by both development
(Visual Studio) and design tools (Expression suite). Some stuff is new (XAML), other
stuff (<a href="http://msdn2.microsoft.com/en-us/library/ms536471.aspx">HTML applications</a> anyone?)
is made easier. And this intersection is just the beginning (notice I'm falling into
a trap of focusing on just the front end - there's exciting stuff happening in
other layers too!)
</p>
        <p>
Some people will dismiss what's happening as "nothing really new" or "just more technology"
but the possibility is here, right now, to make this so much more. This is why we
should get excited about RIA. Many people are already exploring it, figuring
out what works and what doesn't. I'm not totally sure where everything is heading.
I'm trying to chart a course like everyone else and people far smarter than I
am are leading the way. But I can't get rid of the feeling that we're at the
beginning of something significant. It seems we've come so far but that's nothing
compared to the changes we'll see in the coming years. The first step, what we're
seeing now, is a lot of exploration and playing with the technology to see what we
can do, as we also figure out what we should do. People's approach to application
design will change as high profile projects become successful and as the industry
discovers and publishes best practices, etc. Security will be a big deal. How well
applications are developed will affect users' perception of the technology and of the
companies that build it. There's a lot to think about and there will be more people
discussing it. This can be a good thing and a bad thing, but it's definitely
an exciting thing.
</p>
        <p>
I'm unsure if I'm saying anything useful here or not. I do know that I'm a code junkie
and am rather surprised I have two opinion pieces on RIA within a week. I love implementing
the things I imagine in my head, whether I'm making money off the code or not. So
far I haven't discussed code much with WPF/Silverlight/etc. I need to change
that on here, but I want to contribute something useful, so stay tuned, I'll get there
as soon as I can! Thanks for reading.
</p>
      </div>
    </content>
  </entry>
  <entry>
    <title>WS Interoperability - Java / .NET - wsimport issue</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/09/WSInteroperabilityJavaNETWsimportIssue.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,7f818849-5bf0-410e-9037-e43919ec1d4d.aspx</id>
    <published>2007-08-08T23:47:26.0731224-04:00</published>
    <updated>2007-08-08T23:47:26.0731224-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Ah yes, the perils of having co-authored a Java book and now working in a .NET shop
:) Whenever a Java project arises, it comes my way. (Okay, I'm not REALLY complaining,
but it's humorous to me.) This project seemed easy enough: someone else wrote a web
service in C#. I needed to write Java code to invoke the .NET web service and package
it as a small SDK of sorts. So I hit the web service, pulled down the
WSDL, ran it through Axis, and I got some code. However, I didn't want the additional
dependency on Axis when delivering the Java code to our customer, so I tried
using wsimport, a tool that comes with the JDK. Should work right away, right?
</p>
        <p>
Sadly, no. I got an error about the s:schema being undefined, so I tried passing the
address of XMLSchema.xsd from w3.org as a binding via the -b option.
</p>
        <p>
Then I got a confusing error that left me scratching my head for longer than I care
to admit. It is because of this error that I am going to the trouble of writing this
post. I found exactly one hit on Google about this error and that search result simply
had someone replying "I haven't seen this before....I think you solve it by....."
which unfortunately was useless to me.
</p>
        <p>
To cut to the point, the issue was one of the properties on the C# object is
a DataSet and wsimport ain't very happy with DataSets since the types cannot
be determined until runtime. The solution is straightforward: use a typed DataSet.
There's a ton of documentation online describing how to create typed DataSets in Visual
Studio. What I couldn't find, though, was anyone linking the following error
with DataSet interoperability issues, for whatever reason. It's likely that this is
NOT the only case where wsimport will output the error message below. If this post
doesn't help you and you're getting this error, check interoperability problems, check
that the XMLSchema.xsd namespace/location is correct (specify -b if needed, perhaps),
and if none of this helps - take a WSDL you know wsimport will work with, compare
it to the WSDL that's failing, see where the differences are and start removing constructs
that seem out of place. Once the problem WSDL works, look at the last construct you
removed, then you'll have your problem narrowed down.
</p>
        <p>
Here's the error I received from wsimport:
</p>
        <blockquote>
          <p>
            <font face="Courier New" size="2">C:\javacode\ws&gt;wsimport DotNetWebService.wsdl
-b <a href="http://www.w3.org/2001/XMLSchema.xsd">http://www.w3.org/2001/XMLSchema.xsd</a></font>
          </p>
          <p>
            <font face="Courier New" size="2">XML reader error: javax.xml.stream.XMLStreamException:
ParseError at [row,col]:[112,36]<br />
Message: A '(' character or an element type is required in the declaration of element
type "xs:schema".<br />
XML reader error: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[112,36]<br />
Message: A '(' character or an element type is required in the declaration of element
type "xs:schema".<br />
at com.sun.xml.internal.ws.streaming.XMLStreamReaderUtil.wrapException(XMLStreamReaderUtil.java:249) </font>
          </p>
        </blockquote>
        <p>
Once the DataSet was changed to a strongly typed DataSet, the additional binding option
wasn't needed. You can execute <font face="Courier New" size="2">wsimport DotNetWebService.wsdl</font> and
end up with your code.
</p>
        <p>
I hope this helps someone out there.
</p>
      </div>
    </content>
  </entry>
  <entry>
    <title>An Arithmetic Puzzle</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/07/AnArithmeticPuzzle.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,65224ed3-f3be-43a6-b82b-2441d3b0e4fc.aspx</id>
    <published>2007-08-06T23:22:38.9793724-04:00</published>
    <updated>2007-08-06T23:22:38.9793724-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Okay, so I went to write a post today and found I have some good longer posts in mind but
didn't feel like putting one of those together. And unfortunately, I struggled
to come up with a short topic :) So, for the few readers I have at the moment,
I'll instead post a little puzzle that one of my smartest friends shared with me.
It's the type of thing you either see right away or get stuck on for a while, I think.
</p>
        <p>
In this equation: <font face="Courier New" size="2">101 - 102 = 1</font></p>
        <p>
Move one and only one digit to make this equation true. By "one digit" I mean literally
one digit - not one "type" of digit - so you can't move more than one 1 for example.
</p>
        <p>
Enjoy
</p>
      </div>
    </content>
  </entry>
  <entry>
    <title>Addressing Schema Validation Issue with WorkItemTypeSchema.wsd</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/06/AddressingSchemaValidationIssueWithWorkItemTypeSchemawsd.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,19efc25e-8d06-465d-9f33-9cf7b404775b.aspx</id>
    <published>2007-08-06T00:11:00.3699974-04:00</published>
    <updated>2007-08-06T00:11:00.3699974-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Many within the company I work for are happy with the deployment of Team Foundation
Server (TFS), including people that tend toward the skeptical side with all of Microsoft's
offerings. It's great to see such a powerful, integrated tool actually helping development
and project management as promised. I am still the main developer supporting our deployment
of TFS, so when our configuration management (CM) group requires a tool or any extensibility
done to TFS (such as custom controls in our work item types) I am the one to do the
implementation.
</p>
        <p>
At the moment, CM is looking for some specialized handling of work item type files,
details of which I'll possibly outline in a future post. Since this requires processing
the work item type XML files, I went to retrieve the <a href="http://www.microsoft.com/downloads/details.aspx?familyid=51a5c65b-c020-4e08-8ac0-3eb9c06996f4&amp;displaylang=en" target="_blank">Visual
Studio SDK</a> which includes the schema files for the work item types.
</p>
        <p>
The xsd program generates code based on the original schemas. However, passing
these files through any validator, such as XMLSpy's, reveals problems with the XML.
Since xsd generates code I can use, I could stop here, but I wanted to explore the
XML and see if there's a way to fix it. Also, I don't think this is a blocking issue
for many at all, explaining why I haven't seen much about this online.
</p>
        <p>
After opening WorkItemTypesSchema.xsd in XMLSpy we get this warning message:
</p>
        <blockquote>
          <p>
Some of "include" and/or "import" and/or "redefine" statements in the following files
have no schemaLocation attribute and will be ignored!
</p>
        </blockquote>
        <p>
Step 1 was to address this issue, so I added schemaLocation="typelib.xsd" to the import
of typelib.xsd near the top of the file.
</p>
        <p>
Then I saved and got the following error message:
</p>
        <blockquote>
          <p>
This file is not valid! If you save the file in its current state, other XML processors
may have a problem opening the file. 
</p>
        </blockquote>
        <p>
When something like this happens, I struggle to find out "why?" If you read Raymond
Chen's blog regularly, there are many instances where software such as Windows <em>seems</em> to
do something boneheaded but upon thinking about it for a few moments (or viewing it
in light of backward compatibility) it makes sense. I struggled for a few days to
come up with the best answer I could to explain this failure to validate and
I only see one likely possibility, but we'll get to that shortly. 
</p>
        <p>
So, why is the schema file invalid? It's because of non-determinism. Searching for
"XML" and "non-determinism" on Google brings us to this <a href="http://msdn2.microsoft.com/en-us/library/9bf3997x(VS.71).aspx">http://msdn2.microsoft.com/en-us/library/9bf3997x(...</a> page
at MSDN, which states: 
</p>
        <blockquote>
          <p>
A deterministic schema is a schema that is not ambiguous, allowing the parser used
by the Schema Object Model (SOM) to determine the sequence in which elements should
occur in order for an XML document to be valid. It is possible for an XML Schema to
be ambiguous, or non-deterministic. A schema is considered to be non-deterministic
if the parser is unable to clearly determine the structure to validate with the schema.
When validation is attempted on a non-deterministic schema, the parser used by the
SOM generates an error.
</p>
        </blockquote>
        <p>
I now have two conflicting pieces of information. I have XMLSpy telling me the schema
isn't valid. Then I have MSDN telling me schemas <em>can</em> be ambiguous
coupled with xsd successfully generating code. If we follow the cos-nonambig link
in the error window of XMLSpy, it brings us to this page <a href="http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/#cos-nonambig">http://www.w3.org/TR/2004/REC-xmlschema-1-20041028...</a> that
states: 
</p>
        <blockquote>
          <p>
A content model must be formed such that during <a href="http://www.w3.org/#key-vn">·validation·</a> of
an element information item sequence, the particle component contained directly, indirectly
or <a href="http://www.w3.org/#key-impl-cont">·implicitly·</a> therein with which
to attempt to <a href="http://www.w3.org/#key-vn">·validate·</a> each item in the
sequence in turn can be uniquely determined without examining the content or attributes
of that item, and without any information about the items in the remainder of the
sequence.
</p>
        </blockquote>
        <p>
We expect the ultimate authority is w3.org. After more Google searches, it appears
the existence of ambiguous schemas is (or was, hopefully) expected, even if these
schemas don't validate. Rick Jelliffe, a former member of the XML Schema Working Group
says: 
</p>
        <blockquote>
          <p>
I have received several very negative reports on the state of interoperability of
tools using XML Schema ... The most common complaint is tools that generate ambiguous
XML Schemas ... Ambiguous schemas effectively break everything downstream (<a href="http://www.w3.org/2005/05/25-schema/rick.html">http://www.w3.org/2005/05/25-schema/rick.html</a>)
</p>
        </blockquote>
        <p>
Okay, so the schema isn't valid, it apparently violates the specification at w3.org,
and we have someone that should have some authority on this matter saying ambiguous
schemas are bad. In spite of MSDN acknowledging ambiguous schemas can exist (and corroborated
by other sites I browsed), I think Microsoft should have made this schema validate. 
</p>
        <p>
XMLSpy indicates the &lt;xs:complexType name="FieldDefinition"&gt; tag is where
the non-determinism exists. We see the non-determinism almost immediately (if we know
how to identify it) in the following lines: 
</p>
        <blockquote>
          <p>
            <font face="Courier New" size="2">&lt;xs:complexType name="FieldDefinition"&gt;<br />
  &lt;xs:sequence&gt;<br />
    &lt;xs:group ref="PlainRules" minOccurs="0" maxOccurs="unbounded"/&gt;<br />
    &lt;xs:element name="HELPTEXT" type="HelpTextRule" minOccurs="0"/&gt;<br />
    &lt;xs:group ref="PlainRules" minOccurs="0" maxOccurs="unbounded"/&gt;<br />
  &lt;/xs:sequence&gt;</font>
          </p>
        </blockquote>
        <p>
This specifies a sequence of PlainRules (0 or more), followed by zero or one occurrence
of a HELPTEXT element, followed by 0 or more PlainRules again. Since maxOccurs defaults
to 1 when it is omitted, HELPTEXT can appear at most once, although the following
discussion holds true whether it's "0 or 1" or "0
or more" HELPTEXTs. Let's simplify this for purposes of explanation and call PlainRules
"A" and HELPTEXT "B". Using symbols from regular expressions (and formal languages
and such, stretching back to university) we'll use * as "0 or more occurrences"
and ? as "0 or 1 occurrence." This allows us to look at this XML fragment containing
the non-determinism as: 
</p>
        <blockquote>
          <p>
A* B? A*
</p>
        </blockquote>
        <p>
I went to the trouble of converting to these symbols in order to easily illustrate
the non-determinism. By the XML schema specification, each element (the particle component)
must unambiguously belong to a predictable part of the schema. The following sequences
belong to the language A* B? A*: 
</p>
        <ul>
          <li>
AAAAABAAAAA 
</li>
          <li>
ABA 
</li>
          <li>
AABAA 
</li>
          <li>
AB 
</li>
          <li>
BA</li>
        </ul>
        <p>
All the preceding sequences avoid the ambiguity issue. The A's that come BEFORE the
B belong to the first A group, and all the A's that come AFTER the B belong to the
second A group. The ambiguity is solved here by the presence of B, but the language
states B is optional, so the following are also valid members of the language A* B?
A*:
</p>
        <ul>
          <li>
A</li>
          <li>
AAAAAAAAA 
</li>
          <li>
AAA</li>
        </ul>
        <p>
This is where we encounter the ambiguity. When there is a single A, does that
A belong to the first group of A's or the second? When there is a sequence of A's,
such as "AAA" then it's just as likely for any of the A's to belong to the A* before
the B? as it is for them to belong to the A* after. There is no way to know which
A* an A belongs to without mandating B in the middle. 
</p>
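        <p>
To make the ambiguity concrete, here is a small sketch (in Python, purely for illustration;
count_parses is a name I made up, and this is not how a schema validator is implemented)
that counts how many ways a sequence can be split across the two A* groups of A* B? A*.
A sequence with exactly one split is unambiguous; more than one is where the trouble starts.
</p>
        <pre>
```python
def count_parses(seq):
    """Count the ways seq can be matched against A* B? A*.

    A parse is a choice of how many leading A's go to the first A* group.
    The optional B (if present) must come next, and the rest must be A's.
    """
    parses = 0
    for split in range(len(seq) + 1):
        first, rest = seq[:split], seq[split:]
        if set(first) - {"A"}:
            continue  # the first group may contain only A's
        if rest.startswith("B"):
            rest = rest[1:]  # consume the optional B
        if set(rest) - {"A"}:
            continue  # the second group may contain only A's
        parses += 1
    return parses


for s in ["ABA", "AABAA", "BA", "A", "AAA"]:
    print(s, count_parses(s))
# ABA 1
# AABAA 1
# BA 1
# A 2
# AAA 4
```
</pre>
        <p>
Any sequence containing a B has exactly one split, while a run of n A's has n + 1 of them.
</p>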
        <p>
The only reason I can see for Microsoft to introduce this ambiguity is to make it
easier for people modifying the work item type definitions. These people can place
HELPTEXT anywhere as a child of the FIELD element, instead of mandating that HELPTEXT
appear first. It seems a straightforward requirement to say "HELPTEXT" must appear
in a specific position rather than anywhere in a big mess of PlainRules, since all
other elements must be placed precisely. 
</p>
        <p>
In order to fix this issue, any tweaks done to the schema must not break the existing
schema (such that a work item type validates against both the original and the fixed
schemas). The most straightforward way I came up with is to mandate that HELPTEXT,
if present, is the first child of the FIELD element. This means the language A* B?
A* is rewritten as B? A*. Now the parser knows if there are any A's (PlainRules),
they match the one and only A* specification. I changed the FieldDefinition type in
the schema to 
</p>
        <blockquote>
          <p>
            <font face="Courier New" size="2">&lt;xs:complexType name="FieldDefinition"&gt;<br />
   &lt;xs:sequence&gt;<br />
      &lt;xs:element name="HELPTEXT" type="HelpTextRule"
minOccurs="0"/&gt;<br />
      &lt;xs:group ref="PlainRules" minOccurs="0" maxOccurs="unbounded"/&gt;<br />
   &lt;/xs:sequence&gt;</font>
          </p>
        </blockquote>
        <p>
For those a step ahead of me, you'll realize that if HELPTEXT appears after any PlainRules
in a work item type, it no longer validates against this revised schema (while still
validating against the original). Since I'm writing a tool to manipulate
work item types, my fix is to execute code to rewrite the work item type, ensuring
any occurrence of HELPTEXT appears as the first child of its FIELD element.
I'll be placing this code on this site shortly in case anyone wants to use it. 
</p>
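        <p>
Until that code is posted, the rewrite can be sketched roughly as follows. This is a Python
illustration using the standard xml.etree.ElementTree module, not the tool itself. The function
name move_helptext_first is made up, the sample REQUIRED rule and field name are only there to
have something to reorder, and I'm assuming FIELD and HELPTEXT are not namespace-qualified.
</p>
        <pre>
```python
import xml.etree.ElementTree as ET


def move_helptext_first(root):
    """Make HELPTEXT the first child of every FIELD element at or below root.

    Returns the number of FIELD elements that had to be changed.
    """
    moved = 0
    # Take a snapshot of the FIELD elements so we never reorder children
    # while the tree iterator is still walking them.
    for field in list(root.iter("FIELD")):
        help_text = field.find("HELPTEXT")
        if help_text is None or field[0] is help_text:
            continue  # nothing to do for this field
        field.remove(help_text)
        field.insert(0, help_text)
        moved += 1
    return moved


# Build a tiny work item fragment in memory (illustrative content only).
fields = ET.Element("FIELDS")
field = ET.SubElement(fields, "FIELD", name="Title")
ET.SubElement(field, "REQUIRED")
ET.SubElement(field, "HELPTEXT").text = "Short summary of the work item"

print(move_helptext_first(fields))       # 1
print([child.tag for child in field])    # ['HELPTEXT', 'REQUIRED']
```
</pre>
        <p>
Running it a second time changes nothing, so it's safe to apply to work item types
that already have HELPTEXT in the right place.
</p>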
        <p>
There's one more modification needed to the XML schema so it validates - the two regular
expressions used at the bottom (in SizeType and PaddingType) have their commas escaped,
and a backslash followed by a comma is not a valid escape in XML Schema regular expressions.
Once the backslashes before the commas are removed, and the other changes done,
the schema validates fine. 
</p>
        <p>
Please note that I don't know every corner of the XML schema specification, nor have
I explored the Orcas or Rosario TFS versions, so information here might be incorrect
or out of date shortly. I imagine I'll update this topic after seeing how things change
in Orcas/Rosario.
</p>
        <img width="0" height="0" src="http://www.artofcoding.net/blog/aggbug.ashx?id=19efc25e-8d06-465d-9f33-9cf7b404775b" />
      </div>
    </content>
  </entry>
  <entry>
    <title>Weighing in on RIA</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/05/WeighingInOnRIA.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,31ed32c4-bbc9-4fd0-b3ed-71f251ea85c5.aspx</id>
    <published>2007-08-05T03:51:56.6981224-04:00</published>
    <updated>2007-08-05T03:51:56.6981224-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
There's much discussion of RIA in the blogosphere, most of which I've been reading
at some <a href="http://scarynoises.com/blog/">Microsoft</a> <a href="http://blogs.msdn.com/tims/default.aspx">blogs</a> and
some <a href="http://madowney.com/blog/">Adobe</a> <a href="http://blog.digitalbackcountry.com/">blogs</a>.
I'm not an artist; even my stick figures reveal my lack of skill (and I don't have
the patience to get good like Richard Feynman did), but clean user interfaces and
strong user experiences have been a side interest of mine dating back to my first
reading of <a href="http://www.amazon.com/Design-Everyday-Things-Donald-Norman/dp/0465067107/ref=pd_bbs_sr_1/105-0724749-7997202?ie=UTF8&amp;s=books&amp;qid=1186299915&amp;sr=8-1">Design
of Everyday Things</a>. I've been turning my attention more toward RIA and the supporting
infrastructures with this next wave of technology that includes WPF and Silverlight
and Adobe AIR and Flash and Flex and others, and eventually might include Seadragon
and Surface, and whatever technology other companies come out with. My views align
with <a href="http://blogs.msdn.com/msmossyblog/">Scott Barnes</a> in general,
but I wanted to hash out what I think in writing about RIA since it'll force
me to organize my thoughts. I see RIA as more of a shift in philosophy, a fresh approach
to application design, than as a specific technology, though the new technologies
enable RIA development.
</p>
        <p>
There are two significant perspectives in the RIA picture: that of the developer and
that of the user.
</p>
        <p>
(Note: I recognize I am speaking in generalities and not supporting everything I
say with evidence, but bear with me. If you're reading this and disagree, chime in)
</p>
        <h3>The Developer Perspective
</h3>
        <p>
As developers, we tend to lose sight of the bigger picture. We get mired in discussions
of which technology is better, why X language sucks because it doesn't support Y feature
and why Z company is superior. These are religious discussions in the technology world
and I have very little patience for them. There is no one language, no one platform
that will be the best answer in absolutely every case, and when we have a choice, why
do we waste time complaining about a certain technology? We figure out which
technology best allows us to solve a particular problem and we move forward. Sometimes
we deal with constraints (like working at a company that only uses Microsoft technology)
but as long as the job can get done, we're fine with whatever we work with (or we
quit to work for a company more in line with our personal preferences).
</p>
        <p>
I also see some software developers constantly trying to shift perception
of Microsoft technology by only focusing on the negatives and ignoring the positives.
On one hand, this indicates people are holding Microsoft to a higher standard - they
expect perfection and nail Microsoft when perfection is missed. But this view ignores
the reality - Microsoft is like any other big software company. Product quality varies
from one to another. Some features may make no sense outside the design meetings where
an imperfect decision <em>had</em> to be made. Other decisions are actually the right
ones from one angle but wrong from a different angle. And some decisions are made
that are just wrong. I hate blanket statements I've seen online that say "everything
from Microsoft sucks" because it ignores the many successes and reveals the commenter's
ignorance. There are legitimate reasons to zing Microsoft (for example, no label viewer
in TFS 1.0? I know they have deadlines and have to cut features, but still, that feature
got cut? :) ) so let's shy away from dismissing Microsoft - or any company - out of
hand.
</p>
        <p>
I used to hate Microsoft over ten years ago. Even back then it was hip to hate Microsoft.
But my views changed abruptly when I gave MS technology a serious chance. It might
have been Internet Explorer surpassing Netscape that got me hooked, I'm not sure.
I haven't let go for two main reasons: Microsoft technology, as a whole, is actually
quite nice; and, I can get my job done fast. I've had to wrestle far less with Microsoft
technology than other technology. I keep up with other technology for the times when
MS doesn't have what I need, or when I have technology constraints I can do nothing
about. I mention this background because the "let's hate Microsoft" and "Silverlight
will fail" discussions are nothing new to me. People think Windows is dying or new
technology from Microsoft is a failure and these people totally miss the point. Wishing
Microsoft would go away didn't work 10+ years ago and it's not going to
work now.
</p>
        <p>
At the end of the day, software engineers are problem solvers. We implement solutions
in whatever domain we live and work in. Can .NET help us do this? Yes. Can the Java
platform? Yes. Can I roll out professional websites using IIS, ASP.NET, Windows Server
2003, etc.? Yes. Can I do the same with LAMP? Yes. This is why the religious discussions
should stop in our industry. The people that hate Microsoft will continue to hate
them, the people that don't like Java or open source will continue their avoidance,
but the funny thing is, these factions will continue to solve problems and continue
being productive (hopefully!).
</p>
        <p>
I'm semi-ranting and meandering a bit, but what I'm getting at is Silverlight/WPF
aren't going anywhere and we need a more open perspective as software engineers. There's
much about Silverlight that should get recognized as "cool" and important:
</p>
        <ul>
          <li>
Cross-platform CLR that does not require the .NET framework</li>
          <li>
DLR, the dynamic language runtime, extending the language support of .NET even further</li>
          <li>
Cross-platform support of a subset of XAML/WPF (which I imagine will grow closer to
the full implementation over time)</li>
          <li>
XAML XAML XAML. I'm incredibly excited about XAML because I see it as "the new HTML."
Once Silverlight has strong penetration, website designers can choose to develop sites
in XAML and be confident people can view them. No more dealing with messy HTML/CSS
and testing on every browser to make sure the site looks/works the same.</li>
        </ul>
        <p>
I also like that the technologies on the Adobe side (HTML, JS, Flash, Flex) are given
a home on the desktop via AIR. This makes it easy for website designers to extend
their skill set to the desktop. Adobe brought the design world to desktop applications
and Microsoft brought developers to the world of rich application design, far surpassing
the stodgy world of the past (MFC, WinForms, etc.). There's plenty of reason to get excited
about <em>both</em> technologies. (For the record, yes, I'm aware of Sun's offering,
but no comments at the moment.) We must be responsible software engineers moving forward
as the RIA world evolves, and this means staying well informed about as much technology
as we can.
</p>
        <p>
Which technology will "win?" That's the wrong question to ask because there's no competition.
Both will continue to exist, each caters to a different type of developer, and the
people that really matter in the end are the users.
</p>
        <h3>The User Perspective
</h3>
        <p>
I envision a spectrum of users, from those with the absolute bare minimum of knowledge
required to use computers to those that are fairly sophisticated but don't do
software development. The one thing that unites users is they want their software
to work. This is a simple goal at its most basic for software developers, but also
a tough goal because everyone uses software a little differently. Some people will
love an application and others will hate it, either because certain features are hard
to use or the user's sense of how a feature should work is different from how it actually
works. The larger our user base is for a product, the more we have to first focus
on the functionality that affects 90% of the users, and then going forward we can
refine the product to work well with as many additional users as possible. This is
reality again intruding on what we create - limited resources, limited time,
etc. We can also never win 100% of the users - I doubt any product can. There
are people <a href="http://tedblog.typepad.com/tedblog/2005/11/10_reasons_to_h.html">that
don't like iPods due to bad experiences</a>, but it doesn't hinder the success of
the iPod.
</p>
        <p>
Let's start at the basic end of the spectrum. Basic users want things to "just work,"
whether it's their car or their TiVo or their operating system or some other software.
They don't know how it works, they don't care how it works. If it breaks, they want
it fixed. Take a car to a mechanic, call the Maytag repairman (okay, maybe call him
to fix your cable), get the neighbor's kid to remove spyware. These are the users
that don't care if software automatically updates itself - <em>as long as the updates
don't break anything.</em> We can't disregard these users when we write software,
or discount how many of them there are. These users are a significant part of the
reason Microsoft stays as committed to backwards compatibility as they do - if users'
existing software didn't work on a new platform, they'd refuse to upgrade, or worse,
refuse to use Windows going forward. Whether a site is implemented in Adobe technology
or Microsoft technology or whatever, the users don't know and don't really care. Why
should they? They want sites they visit and links shared by friends to work, they
want to read and respond to e-mail without a hassle.
</p>
        <p>
As we move to the other end of the spectrum, we find users that have an increased
knowledge of their software and how it works. They'll know where the advanced configuration
dialogs are and will pretty much understand all the options. These users <em>might </em>turn
off automatic updates in order to have more control, but the only reason they'd do
this is if they've been burnt by automatic updates in the past. These users are more
informed about technology and might have strong opinions. The more sophisticated a
user is, the more control they want over their world. It's probably why they are sophisticated
to begin with - dissatisfaction with the default configuration, a yearning
to understand all they can, or they have specific needs met only by the nether regions
of a program (think about how many features Word offers that most users don't use).
</p>
        <p>
The difficulty in developing software is knowing just how many options to expose and what
sort of application design will appeal to the majority of users, and hopefully
to all users. Most users won't explore configuration too deeply and will in fact be
intimidated by too many options. Most users don't want a deep level of choice - again,
they simply want software that does what they expect - though we must balance this
with what the more sophisticated users want. When we design applications in the near
future, we have to think deeply about users. It <em>is</em> a challenge, but the evolution
that is occurring in the software industry can help elevate the nature of applications
we design.
</p>
        <h3>Conclusion
</h3>
        <p>
It appears I've wandered far away from RIA, but I haven't. I see Rich Interactive
Application design (yes, I prefer this term, and no, not simply because I'm toeing
the MS line) as a refocusing on the user via rethinking our user interfaces and
application designs, whether applications are on the Internet or not. Using new
technology, whether it's Flex or Silverlight or something else, opens more possibilities
for us as software engineers. I think we're at the beginning of the next major wave
of how people interact with computers and it's definitely an exciting time to contribute
our vision and our expertise. It is important to raise our consciousness about this
shift and move away from arguing about which technology is superior. We're all in
this together; now let's get to the work of building awesome, useful technology
using the tools given to us.
</p>
        <img width="0" height="0" src="http://www.artofcoding.net/blog/aggbug.ashx?id=31ed32c4-bbc9-4fd0-b3ed-71f251ea85c5" />
      </div>
    </content>
  </entry>
  <entry>
    <title>WPF Basics: Type Converters</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/04/WPFBasicsTypeConverters.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,68c870ee-9eb0-484e-8c61-d81972cb07e5.aspx</id>
    <published>2007-08-03T22:27:07.6668724-04:00</published>
    <updated>2007-08-03T22:27:07.6668724-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Windows Presentation Foundation introduces a number of new concepts, such as XAML,
dependency properties, data binding (in the WPF/XAML world), and type converters.
Let's take a brief look at type converters and how they're used in WPF.
</p>
        <p>
Since XAML is an XML dialect, parameters to objects are specified as strings. The
parameters aren't usually actually strings, so we need a way for the XAML parser to
convert the string to the correct object type. This is accomplished via type converters.
</p>
        <p>
Using a type converter in XAML is easy, for example:
</p>
        <p>
          <font face="Courier New" size="2">&lt;Button Content="Accept" Background="Blue" /&gt;</font>
        </p>
        <p>
The color specified as a string is converted to a Brush object (a SolidColorBrush) by the XAML parser,
since the Button class's Background property is of type Brush.
</p>
        <p>
A type converter is obtained by passing a type to the TypeDescriptor.GetConverter method,
like this:
</p>
        <p>
          <font face="Courier New" size="2">System.ComponentModel.TypeDescriptor.GetConverter(typeof(<em><font face="Arial">type</font></em>));</font>
        </p>
        <p>
The TypeConverter class has a number of useful methods, some of which are:
</p>
        <ul>
          <li>
CanConvertFrom: reports whether the converter can convert from a given source type</li>
          <li>
ConvertFrom: performs the conversion from the source type</li>
          <li>
CanConvertTo: reports whether the converter can convert to a given destination type</li>
          <li>
ConvertTo: performs the conversion to the destination type</li>
          <li>
IsValid: reports whether a given value is valid for this type</li>
          <li>
GetCreateInstanceSupported: reports whether the converter can create an object from
a set of property values</li>
          <li>
CreateInstance: re-creates an object based on an IDictionary of property values</li>
        </ul>
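        <p>
If the pattern is new to you, it may help to see it stripped down outside of .NET. The sketch
below is a Python analogy, not the .NET API: Color, parse_color, StringConverter and get_converter
are all names I made up, loosely mirroring CanConvertFrom, ConvertFrom, IsValid and
TypeDescriptor.GetConverter. It shows the basic idea of looking up a converter by target type
and handing it a string from markup.
</p>
        <pre>
```python
from typing import NamedTuple


class Color(NamedTuple):
    r: int
    g: int
    b: int


# A couple of named colors, just enough for the example.
NAMED_COLORS = {"Red": Color(255, 0, 0), "Blue": Color(0, 0, 255)}


def parse_color(text):
    """Accept a known color name or a #RRGGBB hex string."""
    if text in NAMED_COLORS:
        return NAMED_COLORS[text]
    if len(text) == 7 and text.startswith("#"):
        return Color(*(int(text[i:i + 2], 16) for i in (1, 3, 5)))
    raise ValueError(f"not a color: {text!r}")


class StringConverter:
    """Toy converter: turns a string from markup into a target object."""

    def __init__(self, parse):
        self._parse = parse

    def can_convert_from(self, source_type):
        return source_type is str  # this sketch only understands strings

    def convert_from(self, value):
        if not self.can_convert_from(type(value)):
            raise TypeError(f"cannot convert from {type(value).__name__}")
        return self._parse(value)

    def is_valid(self, value):
        try:
            self.convert_from(value)
        except (TypeError, ValueError):
            return False
        return True


# Registry keyed by target type, standing in for the framework's lookup.
CONVERTERS = {Color: StringConverter(parse_color), int: StringConverter(int)}


def get_converter(target_type):
    return CONVERTERS[target_type]


converter = get_converter(Color)
print(converter.can_convert_from(str))   # True
print(converter.convert_from("Blue"))    # Color(r=0, g=0, b=255)
print(converter.is_valid("#12345"))      # False (one hex digit short)
```
</pre>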
        <p>
One of the beautiful things about XAML and other bits introduced with WPF is
that they aren't WPF specific. XAML is an application markup language that mirrors
.NET classes in markup, so it can be used outside WPF. Type converters are no different
- the support lives in System.ComponentModel, part of the core .NET Framework, so you
can add knowledge of these bits to
your tool kit and use them where appropriate.
</p>
        <p>
If you want to implement your own type converter, reference this link at MSDN: <a href="http://msdn2.microsoft.com/en-us/library/ayybcxe5.aspx">http://msdn2.microsoft.com/en-us/library/ayybcxe5....</a></p>
        <img width="0" height="0" src="http://www.artofcoding.net/blog/aggbug.ashx?id=68c870ee-9eb0-484e-8c61-d81972cb07e5" />
      </div>
    </content>
  </entry>
  <entry>
    <title>Useful tools: Scanners and Parsers</title>
    <link rel="alternate" type="text/html" href="http://www.artofcoding.net/blog/2007/08/03/UsefulToolsScannersAndParsers.aspx" />
    <id>http://www.artofcoding.net/blog/PermaLink,guid,54f07327-ae99-4ba1-b234-aa7e29835f81.aspx</id>
    <published>2007-08-03T00:05:06.5731224-04:00</published>
    <updated>2007-08-03T00:05:06.5731224-04:00</updated>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p>
Scanning is the process of breaking input up into discrete tokens (such as one or
more letters forming a word) and parsing is the process of applying meaning to these
tokens (such as multiple words strung together to form a sentence, following specific
grammatical rules). Software developers come from a variety of backgrounds, and while
some may remember using these tools to construct a compiler in their Computer Science
undergrad, I have a feeling many are unfamiliar with (or forgot about) these tools. I'd
wager that the most well known scanner and parser generators, in the general pool of software
developers, are lex and yacc. The "lex" name is short for lexical analyzer, a
tool that recognizes a lexical unit of information, such as a word in an English
sentence, and feeds these units to another program one at a time. The "lex" tool is
a scanner generator since it creates code that scans input. The "yacc" name is
short for "yet another compiler compiler," which is accurate since a parser generator
is most commonly used to build a compiler, so you can say yacc compiles a compiler.
In general, though, "yacc" is a parser generator.
</p>
        <p>
Remember that a compiler is nothing more than a translator. When you write code
in any .NET language, a compiler translates the high level code (such as C#) to a
lower level code (Intermediate Language, or IL) which is what the Common Language
Runtime (CLR) understands and executes. In a way, a decompiler is actually a compiler,
translating the low level code, such as IL, to a higher level language, such as C#.
We use the terms "compiler" and "decompiler" because they communicate the direction
of the translation. Compilers must scan input and then parse it. They must know that
"public" is a token and "class" is a token and what an identifier looks like, but
it's the parser that understands what a method is and what a while loop looks
like and how they translate to IL.
</p>
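        <p>
Here's a tiny sketch of that division of labor, in Python and heavily simplified. The token
kinds, the scan and parse_class names, and the one-rule "grammar" are all made up for
illustration. The scanner only knows what keywords, identifiers and braces look like; the
parser is the part that knows a class declaration needs them in a particular order.
</p>
        <pre>
```python
import re

KEYWORDS = {"public", "private", "class"}
# An identifier/keyword, or a single brace. Whitespace is skipped by findall.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[{}]")


def scan(source):
    """Scanner: split raw text into (kind, text) tokens."""
    tokens = []
    for text in TOKEN.findall(source):
        if text in KEYWORDS:
            kind = "keyword"
        elif text in ("{", "}"):
            kind = "brace"
        else:
            kind = "identifier"
        tokens.append((kind, text))
    return tokens


def parse_class(tokens):
    """Parser: recognise  [public|private] class NAME { }  and give it meaning."""
    kinds = [kind for kind, _ in tokens]
    texts = [text for _, text in tokens]
    visibility = None
    if texts[:1] in (["public"], ["private"]):
        visibility = texts[0]
        kinds, texts = kinds[1:], texts[1:]
    well_formed = (
        len(texts) == 4
        and texts[0] == "class"
        and kinds[1] == "identifier"
        and texts[2:] == ["{", "}"]
    )
    if not well_formed:
        raise SyntaxError("expected: [public|private] class NAME { }")
    return {"visibility": visibility, "name": texts[1]}


print(scan("public class Widget { }"))
print(parse_class(scan("public class Widget { }")))
# {'visibility': 'public', 'name': 'Widget'}
```
</pre>
        <p>
Feeding it "class public { }" scans without complaint, since every piece is a valid token,
but the parser rejects it because a keyword sits where the name should be.
</p>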
        <p>
A scanner and parser can be used separately. I won't go into details of constructing
a parser in this post, but I will discuss developing a simple scanner using a
scanner configuration that is translated into C# code. Before delving into an
example, why do we care about a tool that can create a scanner for us? What are the
benefits? Much processing of input we do in the business world (at least in my experience)
is easily constructed by hand. We don't usually need to construct a compiler, so isn't
using a scanner overkill? Like any problem we're called to solve, it's important to
know about as many tools as possible, so when we encounter a case where a certain
tool would save us significant time, we can use it since we know it exists. There
are also business problems, such as sophisticated data translation, where a scanner/parser
might be the perfect set of tools. Here are a few benefits to using an auto-generated
scanner:
</p>
        <ul>
          <li>
A scanner generator allows us to focus on the syntactic elements and not worry about
any other details 
</li>
          <li>
If syntactic elements change, it's easy to update the code file used as input
to the scanner generator  
</li>
          <li>
A scanner can feed anything - a parser can accept the tokens, your custom code
can, etc., so the syntactic analysis of input is separated from applying meaning to
the input 
</li>
          <li>
Development of a scanner is much faster than rolling one by hand, unless input is rather
simple, and relying on a scanner generator reduces chances of introducing error into
the scanner 
</li>
          <li>
A generated scanner is typically table-driven, so it is often faster than one you roll on your own</li>
        </ul>
        <p>
A scanner generator for C# that I've used is available at the <a href="http://www.infosys.tuwien.ac.at/cuplex/lex.htm" target="_blank">C#Lex
site</a>.</p>
        <p>
The syntax of the input files to C#Lex follows the syntax detailed at this <a href="http://www.infosys.tuwien.ac.at/cuplex/lex_mirr.html" target="_blank">JLex
page</a> except where called out at the C#Lex site.
</p>
        <p>
Let's look at an example: analyzing simple English. Words in English can take on multiple
forms:
</p>
        <ul>
          <li>
Words with an initial capital, such as at the beginning of a sentence or proper names 
</li>
          <li>
Contractions - words with an apostrophe (e.g., can't, don't, won't) 
</li>
          <li>
Abbreviations - words that are all capitals or all lowercase and optionally have a period after each
letter (e.g., IL, CLR, e.g.) 
</li>
          <li>
Quoted words - single or double quotes on both sides (e.g., "compiler") 
</li>
        </ul>
        <p>
We'll stop there in the interest of keeping this simple.
</p>
        <p>
An input file to the C#Lex program is formatted in three sections, each section separated
with a double percent (%%) on a line of its own:
</p>
        <ol>
          <li>
User code. This section is copied directly to the output file without modification,
so you can place implementation and 'using' statements here. 
</li>
          <li>
Directives to control C#Lex. This is where you can control C#Lex, such as specifying
scanning states, and also specify regular expressions to match input. 
</li>
          <li>
Scanner rules. This is where you specify what to recognize, what to do with it, state
transitions, etc. When a rule matches here, data can be returned to the code running
the scanning loop via a special class called Yytoken (that you define).</li>
        </ol>
        <p>
The scanning states allow recognition of different syntactic units at different times,
so if certain syntactic units can only appear in certain contexts (think about visibility
keywords such as 'public' and 'private' that can't appear inside a method body)
you can control this in the scanner generator code.
</p>
        <p>
The program we'll write is dead simple for purposes of illustration: it'll accept
tokens from the scanner and output each token, one per line.
</p>
        <p>
I won't go into details of the regular expression language, instead keeping it simple
and showing the regular expressions we need without much explanation.
</p>
        <p>
A word is simple: [A-Z]?[a-z']+
</p>
        <p>
This gets us an optional initial capital and a sequence of one or more lower case
letters and apostrophes. This doesn't limit the number of apostrophes, so let's revise
it.
</p>
        <p>
[A-Z]?[a-z]+'?[a-z]*
</p>
        <p>
We can continue refining to ensure the end of the contraction is one of just a few
options (such as "t" or "s") but let's keep it simple.
</p>
        <p>
A word can also have all capitals: [A-Z]+
</p>
        <p>
optionally separated by periods: ([A-Z]\.?)+
</p>
        <p>
but periods can also separate lowercase letters: ([a-z]\.?)+
</p>
        <p>
Combining these gives us: [A-Z]?[a-z]+'?[a-z]* | ([A-Z]\.?)+ | ([a-z]\.?)+
</p>
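        <p>
A quick way to get a feel for an expression like this is to throw a few words at it. The sketch
below does that with Python's re module rather than C#Lex, so treat it as an approximation:
I've dropped the spaces around | (Python would treat them as literal characters), kept the
optional periods, and made the groups non-capturing. The sample words are my own.
</p>
        <pre>
```python
import re

# The combined word expression, adapted for Python's re module.
WORD = re.compile(r"[A-Z]?[a-z]+'?[a-z]*|(?:[A-Z]\.?)+|(?:[a-z]\.?)+")

samples = ["Lorem", "can't", "CLR", "C.L.R.", "e.g.", "can''t", "McDonald"]
for text in samples:
    verdict = "word" if WORD.fullmatch(text) else "no match"
    print(f"{text:10} {verdict}")
# Lorem      word
# can't      word
# CLR        word
# C.L.R.     word
# e.g.       word
# can''t     no match
# McDonald   no match
```
</pre>
        <p>
The last line is a reminder that the expression only allows a capital at the very start of
a word, so mixed-case names fall through to whatever other rules you define.
</p>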
        <p>
We continue like this until we end up with a set of regular expressions that fully
describe the input. Since the dot matches any character except newlines, we'll add
a rule to pass this input back as a catch all rule. You probably don't want this in
a real application, but it illustrates how to match any input not matched by other
rules. Any whitespace is skipped over, along with punctuation we're not interested
in (exclamation point, question mark, commas, periods). These regular expressions
are far from perfect or comprehensive but they illustrate the process of analyzing
the nature of the input and constructing the required expressions to scan the
input.
</p>
        <p>
I'm including the final file at the end of this post. 
</p>
        <p>
A Yytoken class must be defined. This is the communication mechanism between the scanner
and the code you write and can hold any information you want (such as where in the
input the scanner is, state information, etc). An instance of this class is what is
returned by yylex() in the main scanning loop located in our code:
</p>
        <font face="Courier New" size="2">      Yytoken t;<br />
      while ((t = yy.yylex()) != null)<br />
      {<br />
         System.Console.WriteLine(t.m_text);<br />
      }<br /></font>
        <p>
Now that we have an input file to the scanner generator, we first generate the C#
code by executing C#Lex.exe on this file, then compile the generated C# file by invoking
csc.
</p>
        <blockquote>
          <p>
            <font face="Courier New" size="2">C#Lex english.lex</font>
          </p>
          <p>
            <font face="Courier New" size="2">csc english.lex.cs</font>
          </p>
        </blockquote>
        <p>
The input file (test.txt) has this line: Lorem ipsum dolor sit amet, consectetur
adipisicing elit, sed do eiusmod tempor incididunt ut "labore" et dolore magna aliqua.
</p>
        <p>
Running english.lex.exe gives us the following output:
</p>
        <blockquote>
          <p>
Lorem<br />
ipsum<br />
dolor<br />
sit<br />
amet<br />
consectetur<br />
adipisicing<br />
elit<br />
sed<br />
do<br />
eiusmod<br />
tempor<br />
incididunt<br />
ut<br />
"labore"<br />
et<br />
dolore<br />
magna<br />
aliqua
</p>
        </blockquote>
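        <p>
If you don't have C#Lex handy, the rule set can be mimicked in a few lines of Python to see the
same behavior. This is only a sketch of the idea (try every rule at the current position, keep
the longest match, prefer the earlier rule on ties), not what C#Lex actually generates. One known
difference: Python's | picks the first alternative that matches rather than the longest, so
edge cases around trailing periods may come out differently in a real generated scanner.
</p>
        <pre>
```python
import re

WORD = r"[A-Z]?[a-z]+'?[a-z]*|(?:[A-Z]\.?)+|(?:[a-z]\.?)+"

# (pattern, emit) pairs in the same order as the rules section of the file.
RULES = [
    (re.compile(WORD), True),                  # {WORD}
    (re.compile('"(?:' + WORD + ')"'), True),  # \"{WORD}\"
    (re.compile(r"[\n \t\b\r]+"), False),      # {WHITE_SPACE_CHAR}+
    (re.compile(r"[.,?!]"), False),            # punctuation we skip
    (re.compile(r"."), True),                  # catch-all
]


def scan(text):
    """Return the emitted tokens, preferring the longest match at each
    position and the earliest rule when two matches are the same length."""
    tokens = []
    pos = 0
    while pos != len(text):
        best_end, best_emit = pos, False
        for pattern, emit in RULES:
            match = pattern.match(text, pos)
            if match and match.end() > best_end:
                best_end, best_emit = match.end(), emit
        if best_end == pos:
            raise ValueError(f"no rule matches at position {pos}")
        if best_emit:
            tokens.append(text[pos:best_end])
        pos = best_end
    return tokens


line = ('Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do '
        'eiusmod tempor incididunt ut "labore" et dolore magna aliqua.')
print("\n".join(scan(line)))
```
</pre>
        <p>
For the sample line this prints the same nineteen tokens shown above, one per line, with
the commas and the final period skipped.
</p>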
        <p>
Our final file looks like this:
</p>
        <p>
          <font face="Courier New" size="2">using System;<br />
using System.IO; </font>
        </p>
        <p>
          <font face="Courier New" size="2">class WordExample {<br />
   public static void Main(string[] argv) {<br />
      Yylex yy = new Yylex(new StreamReader(new FileStream("test.txt",
FileMode.Open))); </font>
        </p>
        <p>
          <font face="Courier New" size="2">      Yytoken t;<br />
      while ((t = yy.yylex()) != null)<br />
      {<br />
         Console.WriteLine(t.m_text);<br />
      }<br />
      Console.WriteLine();<br />
   }<br />
} </font>
        </p>
        <p>
          <font face="Courier New" size="2">class Yytoken {<br />
   public Yytoken(string token)<br />
   {<br />
      m_text = token;<br />
   }<br />
   public string m_text;<br />
}</font>
        </p>
        <p>
          <font face="Courier New" size="2">%% </font>
        </p>
        <p>
          <font face="Courier New" size="2">ALPHA=[A-Za-z]<br />
WORD=[A-Z]?[a-z]+'?[a-z]* | ([A-Z]\.?)+ | ([a-z]\.?)+<br />
WHITE_SPACE_CHAR=[\n\ \t\b\012\r] </font>
        </p>
        <p>
          <font face="Courier New" size="2">%% </font>
        </p>
        <p>
          <font face="Courier New" size="2">&lt;YYINITIAL&gt; {WORD} { return(new Yytoken(yytext()));
} </font>
        </p>
        <p>
          <font face="Courier New" size="2">&lt;YYINITIAL&gt; \"{WORD}\" { return(new Yytoken(yytext()));
} </font>
        </p>
        <p>
          <font face="Courier New" size="2">&lt;YYINITIAL&gt; {WHITE_SPACE_CHAR}+ { } </font>
        </p>
        <p>
          <font face="Courier New" size="2">&lt;YYINITIAL&gt; [\.,\?!] { } </font>
        </p>
        <p>
          <font face="Courier New" size="2">&lt;YYINITIAL&gt; . { return(new Yytoken(yytext()));
} </font>
        </p>
        <img width="0" height="0" src="http://www.artofcoding.net/blog/aggbug.ashx?id=54f07327-ae99-4ba1-b234-aa7e29835f81" />
      </div>
    </content>
  </entry>
</feed>