April 2021
Mo Tu We Th Fr Sa Su
<< >>


Website Closures

Posted on Mar 31 2005

When I started working on Dinoparc I looked at several webscript languages and ended up implementing my own (see previous log). It works now very fine and very efficiently since both CPU and RAM are very low on the server (compared, of course, to PHP...). I had fun working on such a new technology and if I have another website to make I'll definitely push it further.

One thing I would like to add to this language (it's Motion-Types but with MotionScript generator and VM) are <strong>closure-links</strong>. What's that ? When you generate dynamically a webpage, you generate a lot of links to other pages with some parameters embedded. Such parameters (http GET parameters) are really to be care because they can be used in a tricky way. One example I remember on a website we wrote was a amount of goods you wanted to sell. If you were changing the value to a negative one, you ended up loosing money and getting goods for cheap, since after some time, money was locked to 0 and you could get an infinite number of goods this way, and then sell them back normally for money. The fix was quick : just forbid negative values. But in fact, since there was 3 links available "sell 1, 5 and 10", the value could as well have been hidden from the user, if not for (good) coding practices.

In all the cases, what you have to do to write good and secure website is to check all incoming parameters for consistency against the ones that were actually passed as parameters by the previous link. Such thing can be automated by a CRC. That means the link index.php?v=33;c=hello can be added an additional parameter crc that ensure that v=33 and c=hello. However any CRC can also be forged by the user, so this approach will not work (unless you try to hide the algorithm and hope nobody find it).

Another way is to use closures. In functional programming languages, a closure is a function (or a method) for which a given number of parameters have been already decided. For example let's take a simple function :

  function plus(a,b) {
    return a + b;

A closure plus(1) could be called "increment" since it will be a function that accepts one parameter "b" and adds 1 to it. A <em>partial application</em> (calling with less that the total numbers of parameters) of a function produces a closure in most of functional languages (this feature is called currying).

What closures does have to do with html links and parameters checking ? Well, if you consider every link as a function which takes some arguments (the parameters), then you can see that every followed link is actually executing a function, which generate a webpage (this analogy is particularity true if your website is a single application such as Dinoparc, and not a bunch of php scripts). So one possibility is, for every link that the user <em>might</em> follow, we can capture the name of the "function" and all the parameters into a closure, assign an unique id to it, and just give a link to a script executing the closure with the given id.

For example, if there is a link to a php script someScript.php?p1=V1;p2=v2 , you create a closure with "someScript.php" and parameters "p1=v1" and "p2=v2", give it an id (for example 456) and return instead a link to execute.php?id=456 . This way the parameters are hidden from the user which cannot hack them since the only thing he can change is the closure id. That gives us a good tip on how and where to store closures. Since the user can change closure id, he should not be able to access other people closures, so we have to store the closures into the current user session.

Let's think about it : every time I generate a page, I look at links with parameters, create closure for each of them, and replace the link with "execute this closure by id". Closures are very small objects. They can even be a single couple (id,string) if you can parse later your URL and call the corresponding script. But there is a bunch of them... Every page might contains more than 20 closures if there is a lot of links having parameters. That means after 10 pages visited, you have something like 200 closures in your session data. Too much, you might think... One first optimization is to reuse closures ids which already have been generated. If one page have a link "run.php?cmd=echo" and one another page have the same link, then obviously following the first or the second link will result into the same result (for a given user) : that might reduce the number of generated closures, especially if your website have fixed menus with parameters that are showing on every page.

Another property of what I would call a "solid" website is the ability of the user to go "Previous" pages using his browser without annoying side effects. For example if I click the "buy" link and then later go back several times, I don't want to buy a second time. Technically speaking, this is achieved by having a two-steps execution : at first do the side effects, but don't generate any page : instead send a "Location" http header with url (that can itself be a closure) that will show the result page. This way if the user comes back his browser will directly redirect him to the result page and will not execute again the "buy" action.

Giving the ability of your user to go back is nice, but you might limit the number of pages in history, in order to keep the number of closures in session low. If you limit the history to 10 pages then it should already be quite enough and still keep the size of the session data small. Another way is to use closures explicitly in your program so only sensible links you want will be turned into closures. But that might be risky since you can never guess what an user can do.

Please note also that data coming from POST forms cannot obviously be protected with closures abstractions, since such data actually comes from the user. But in most websites, that's quite normal and easy to check incoming forms for correctness.

0 comment
Name : Email : Website : Message :