

File Upload and Canonical Issues

Never trust user input. Incoming data can be the source of many evils, and a security flaw can sit there just waiting for the right moment and the right person to break your application.
After finishing the upload control, I finally integrated it with the website. Now users can select files and send them to the website to be processed.
What are the security risks here? One of them is what can be called a 'canonicalization issue'.
For a start, any data can be seen in its canonical form. A canonical form is the simplest, most standard form in which a piece of data can be represented, and canonicalization is the process of converting data to that form.
Proficient JavaScript programmers are well aware of what I am talking about. As a matter of fact, in our system the user can search for a name using wildcards. You could ask a user: "Retrieve me a list of all the instances whose canonical form includes Bill as a mandatory prefix." The user will probably say: "Retrieve what???" But if you ask: "Give me a list of all the users whose names start with Bill", they will type 'bill*' into the system. The user normally does not know it, but what they are doing is performing a type of 'canonical query'.
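To make the wildcard search concrete, here is a minimal sketch in Python (not the code of our system). It assumes shell-style wildcards and a case-insensitive match; the function name search_names and the sample user names are made up for illustration.

```python
from fnmatch import fnmatchcase


def search_names(names, pattern):
    """Return the names matching a shell-style wildcard pattern, ignoring case."""
    # Lower-casing both sides is the canonicalization step: 'Bill', 'BILL'
    # and 'bill' all collapse to the same form before they are compared.
    canonical_pattern = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), canonical_pattern)]


# Hypothetical data: only the first two names start with "bill".
users = ["Bill Smith", "BILLY Jones", "Sybill Trent", "Anna Bill"]
print(search_names(users, "bill*"))
```

The user types one pattern, and the system decides which stored names are "the same" for the purpose of the match.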
Now, back to our file upload issue. A file name is a very common case for canonicalization. You can refer to the same file as:
  • thairecipes.doc
  • c:\recipes\thairecipes.doc
  • c:\\recipes\\thairecipes.doc
  • c:\   recipes\thairecipes.doc
  • c%3A%5Crecipes%5Cthairecipes.doc
As you have probably figured, the last one is the issue. Once the name is URL-decoded, %3A becomes ':' and %5C becomes '\', so it resolves to the same path on your Windows operating system.
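A minimal sketch of what canonicalizing such names could look like, written in Python for brevity. It assumes the name is percent-decoded exactly once and then normalized with Windows path rules; the helper name canonical_windows_path and the fourth spelling in the list are my own additions for illustration.

```python
import ntpath
from urllib.parse import unquote


def canonical_windows_path(raw: str) -> str:
    """Reduce a user-supplied Windows path to one comparable form."""
    # Decode percent-escapes once: %3A becomes ':' and %5C becomes '\'.
    decoded = unquote(raw)
    # normpath collapses repeated separators and resolves '..';
    # normcase lower-cases the result and turns '/' into '\'.
    return ntpath.normcase(ntpath.normpath(decoded))


spellings = [
    r"c:\recipes\thairecipes.doc",
    r"c:\\recipes\\thairecipes.doc",
    "c%3A%5Crecipes%5Cthairecipes.doc",
    "C:/recipes/../recipes/THAIRECIPES.DOC",  # extra spelling, not from the list above
]

# All four spellings collapse to a single canonical path.
print({canonical_windows_path(s) for s in spellings})
```

The point is the order of operations: any check on the file name should run on the canonical form, because a check on the raw string can be bypassed by simply spelling the same path differently.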
You see, because we let the user save just about any file name in our system, we are also opening the door to a kind of canonicalization attack. Remember: never trust the user. And by user I am not only talking about a person. In our context a user is any entity that uses a given resource or service, so a user can just as well be another system or another application.
A hacker would think: "How can I break into this site? Does it allow easy access to any of its resources?" In our case, yes: our website must allow the user to upload files.
What to do now? How to handle a file upload to a web server?
Well, first, as a general rule you must not design a website that accepts just about any file name created by the user and saves the file under it. As a matter of fact, any input must be validated and, if possible, sanitized, not only on the client side but on the server side as well.
A better design: do not allow the user to save the file on the web server under the file name he wants to use. Accept the file, keep the original file name somewhere, and let the application rename the file before saving it. I would suggest using a GUID string for that. That way you are not only closing the door on a possible canonicalization attack, you are also denying a malicious user the chance to guess the file names you might have on your server. For example, if a hacker knows that there is a file called http://mywebsite/mydocs/clientid1/file1.doc he will try something like http://mywebsite/mydocs/clientid1/file2.doc, and then http://mywebsite/mydocs/clientid1/file3.doc and so on. By using an internal naming rule you minimize his attack surface.
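The renaming idea can be sketched roughly like this, in Python rather than the actual server code. The function store_upload, the in-memory dictionary standing in for "somewhere to keep the original name", and the sample file name are all assumptions made for the example.

```python
import tempfile
import uuid
from pathlib import Path


def store_upload(data: bytes, original_name: str, upload_dir: Path, name_index: dict) -> str:
    """Save uploaded bytes under a generated name and remember the original one."""
    # The stored name is a random GUID, so nothing the user typed reaches the
    # file system and neighbouring file names cannot be guessed.
    stored_name = uuid.uuid4().hex
    (upload_dir / stored_name).write_bytes(data)
    # The original name is kept only as data, e.g. to show it back to the user.
    name_index[stored_name] = original_name
    return stored_name


# Usage: even a hostile-looking name never becomes part of the stored path.
with tempfile.TemporaryDirectory() as tmp:
    index = {}
    key = store_upload(b"pad thai", r"..\..\thairecipes.doc", Path(tmp), index)
    print(key, "->", index[key])
```

In a real application the index would more likely be a database table than a dictionary, but the separation is the same: the name on disk is chosen by the application, and the user's name is just a label.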
Another thing to observe: you don't have to fight and defeat every malicious user. There may be hundreds of hackers trying to break your code and you are just one guy against them (and you don't want sleepless nights during weekends, do you?). They always find a way to break your code. The best option is to minimize their attack surface. If your site is strong enough to survive the first rounds of attack, chances are they will move on and concentrate their efforts on breaking a "weaker website".

These are some instinctive considerations; additionally, I would suggest taking a look at implementing File I/O guidelines as well. At the end of the day, it all depends on how secure you want to be, how much time you have available to implement it and how rigid the specifications you were given are.
See you later.



Uploading files and bubbling events

This week, while I was waiting for a new project to be confirmed, I found myself doing some code fixing for another project. That's quite common and I really don't mind it. One of my tasks was to develop a little page to upload images to a server. That's not a problem, so I decided to try something new. Yes, I was very keen to use Silverlight for the task, but I used some good old JavaScript and a very cool concept similar to so-called event bubbling, which is basically raising an event from a given control and letting it propagate up the chain until it is captured.
My first idea was: I will create a web user control that can easily be added to a common web page, and this guy will pass the file list to the page like a bubbled event.
This control will implement an interface to select and hold a list of files, and will be able to send that list to the page and notify the page that the list is ready to upload. Let's call it the MyUpload control.
Being a user control, any programmer can drag and drop this guy onto any page, and it must be self-contained enough not to be too attached to its parent. So it inherits from UserControl and will have, at least for now, two methods called upload and click. Here things get cool: click is an event raiser.
When a programmer uses this control, calling the click method passes the selected file collection to the page. How? By sending the list as arguments of the call.
The upload method is then responsible for populating the list and sending it to the caller.
Once the event is triggered, the web page loops through the items in the received file list and, for each one, does a 'save file as...' with a path in the web server directory. This moves the file to the desired location as specified in the document request.
Could I save the files in the database? Yes, I could, but all the specs for the task are done and that was the mission. So I won't discuss whether there is a better solution for this scenario.
My upload control will have two buttons, add and remove, which are responsible for maintaining the list of files I want to upload. Why is that? Because we want to upload a list of files in one shot and not one by one. Indeed, I'll spend more hours implementing this solution, but in the end this will be a reusable control and, as I said, other programmers will just drag and drop it should they need to implement a similar solution.
How do I implement the code for adding and removing files? That's the most difficult part and requires a good amount of JavaScript knowledge. This JavaScript routine alone could be the subject of a post by itself. For now, let's say the control has a private method called buildjavascript which renders into the page all the JavaScript functions needed to maintain this list.
Returning to our code-behind, this list will be a collection, and the best way to keep it and pass it along on postback is to base it on a very cool and useful class for this situation: HttpFileCollection.
The upload button in the control calls the upload method, which does nothing but raise the event to the page. The page that hosts the control already implements the handler that receives the list as an HttpFileCollection object. Now we only have to save each file under its new name, in its new location.
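Until the real code is ready, the shape of the design can be sketched in a few lines of Python. This is only a stand-in for the ASP.NET control: plain strings replace the posted files, a list of callbacks replaces the .NET event, and every name except MyUpload and upload (subscribe, add_file, remove_file, Page, on_files_ready) is invented for the example.

```python
from typing import Callable, List

FileListHandler = Callable[[List[str]], None]


class MyUploadControl:
    """Holds a list of files and raises an event to whoever hosts the control."""

    def __init__(self) -> None:
        self._files: List[str] = []
        self._handlers: List[FileListHandler] = []

    def add_file(self, name: str) -> None:
        if name not in self._files:
            self._files.append(name)

    def remove_file(self, name: str) -> None:
        if name in self._files:
            self._files.remove(name)

    def subscribe(self, handler: FileListHandler) -> None:
        self._handlers.append(handler)

    def upload(self) -> None:
        # Raise the event: the file list travels as the argument of the call.
        for handler in self._handlers:
            handler(list(self._files))


class Page:
    """The host page: it only knows how to react when the list arrives."""

    def __init__(self, control: MyUploadControl, target_dir: str) -> None:
        self.target_dir = target_dir
        self.saved: List[str] = []
        control.subscribe(self.on_files_ready)

    def on_files_ready(self, files: List[str]) -> None:
        # Stand-in for 'save file as...': record the destination of each file.
        for name in files:
            self.saved.append(f"{self.target_dir}/{name}")


control = MyUploadControl()
page = Page(control, "uploads")
control.add_file("a.png")
control.add_file("b.png")
control.remove_file("a.png")
control.upload()
print(page.saved)
```

The control never references the page directly. It only calls whatever handlers were registered, which is what keeps it reusable on any page.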
My initial design is done and I am working on the JavaScript side of things. This week was hectic, but once things get back to normal and I have some time left, I will share my code here somewhere.
See you later.



Stored procedure takes longer to run in SQL 2005 than in 2000

Here is something interesting that I would like to share with you guys.

Even though Microsoft SQL Server 2005 has been out for quite some time, it is still common to see people working on projects that use Microsoft SQL Server 2000, often in mixed environments.

That's the case I want to talk about: the mixed environment. I am working on a project where some applications have that hybrid configuration.

So someone told me that my report developed in .NET 2.0 was running slower than the similar one done in old-fashioned ASP. Of course I denied it, only to see the proof later that I was wrong.

Yes, the same stored procedure, executed from the same page on the same machine, ran faster in the old environment and slower in the new (and supposedly improved) one. How is that possible? I traced the execution and used SQL Profiler, but nothing gave me a good clue. Then I found this on the Microsoft website:

In SQL Server 2000, the execution plan for the query uses an Index Seek operator. In SQL Server 2005, the execution plan for the query uses an Index Scan operator. The optimizer produces an index spool for the Index Scan operation. When you use the FORWARD_ONLY cursor, SQL Server scans the index for every FETCH statement. Each fetch takes a long time. Therefore, the query takes a long time to execute.

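See the example below:

-- @p1: cursor handle (output)
declare @p1 int
set @p1=0
-- @p3: scrollopt; 16388 includes the FORWARD_ONLY flag (0x0004)
declare @p3 int
set @p3=16388
-- @p4: ccopt (concurrency options)
declare @p4 int
set @p4=8194
-- @p5: rowcount (output)
declare @p5 int
set @p5=0
exec sp_cursoropen @p1 output, <Transact-SQL statement> ,@p3 output,@p4 output,@p5 output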

This code runs faster if you are NOT using the .NET 2005 SQL connectors, or if it runs on SQL Server 2000. Here we are using sp_cursoropen to open a cursor, specifying the forward-only option in the parameter list.

This is a bug you will only hit if you are moving a lot of cursor-based stored procedures from a SQL 2000 to a SQL 2005 environment, and here we have VERY HIGH cursor usage (not that I like cursors or defend their usage; it is just a fact of the environment here).

How to fix this?
If you do not want to download and apply the patch and would rather fix this in the code itself, add "OPTION (FAST 1)" to the query in the stored procedure. That will make it run faster on the SQL 2005 machine. Otherwise, download and apply the patches.

See ya later



TeamSystem and TFS add-on to Count Lines of Code and Predict Errors

Here is another cool thing. On the Microsoft download website you can find this:

"Microsoft IT partnered with Microsoft Research to create a VSTS 2005 extension that counts lines of code and predicts system defects. In the software development environment, insight into the volume of code being produced, and the changes applied to that code, provide measurements of productivity and quality. The Line of Code (LOC) counter provides a flexible and extensible framework for automating the LOC counting process."

Isn't that amazing? Finally a cool software metrics add-on for Team System, and if you wanted another reason to stop using SourceSafe and start using Team Foundation Server, guess what: it also works with TFS.

If you use Team System or TFS, get it here.

See you later.



How to Make Productive Project Meetings

Project meetings can be very productive, but they can also be a real waste of time and money.
Recently, while working at a client, I was responsible for running a project development meeting as the meeting coordinator. The participants were a heterogeneous group and, despite the fact that I did not know some of the attendees, the meeting was a big success.
During a conversation on our coffee break I was asked about meeting strategies and how to conduct meetings.
So I am going to share with you guys what I told them, and what I effectively did during that particular meeting:
  • Every meeting MUST have 3 elements: purpose, agenda and maximum duration. If any of these items is missing, the meeting is meaningless and should not happen.

  • Make sure you are able to define the purpose of the meeting in a maximum of 2 sentences, for instance: "This meeting is to plan the new developments for project X". This way, everyone will know why they are there, what needs to be done and how to proceed in order to succeed.

  • Define a clear agenda in advance. Make a list of all the items to be discussed, revised, analysed, displayed, etc. When I conduct meetings, my personal strategy is to allocate a time limit to each item on the agenda and to assign the responsibility for leading the discussion to someone in the group. Works like a charm.

  • Define a duration for the meeting: how many minutes/hours it should last. From the start, make it crystal clear to everyone what time the meeting will start and, sometimes more importantly, when it will end. It is amazing how many managers have absolutely no control over their meetings and do not know how to enforce the finishing time. If you think you have this habit... CHANGE IT!!!

  • Do not wait for latecomers. Meetings must start at the agreed time. Do not wait for late arrivals. Do not wait for those who need to be called to the meeting. Just make sure everyone gets notified; then, when someone arrives after the meeting has started, DO NOT STOP TO REVIEW WHAT WAS SAID. Do this as a sign of respect for those who arrived on time.

  • If the meeting's organizer is late, consider the meeting cancelled and get back to work. How long is considered late? It depends on the company, but I would not wait more than 5 minutes.

  • Document your meeting. What I do is put someone in charge of writing down the notes. What to put in the meeting notes? Basically the names of the attendees, the subjects discussed, the points agreed, and the next developments and/or actions with dates and the people responsible for them.

  • When the meeting is over, within no more than 24 hours, the meeting notes must be sent to all the participants, to those who could not make it to the meeting and to those who might be affected by upcoming decisions.

  • Keep the focus. Every meeting must have a moderator to notify the others when someone is discussing a subject outside the scope of the current topic. Ask one of those present to volunteer for this task when the meeting is about to start. His/her task is to interrupt the meeting whenever the focus is lost and bring back the main subject. The outside topic can then be noted and perhaps discussed in a future meeting. In case of doubt about whether a specific topic is inside or outside the scope, the meeting organizer has the final word.
I hope these notes are of some help in your next meetings. If you have any comments or other meeting ideas, please feel free to leave them here and share as well.
See ya later.