Friday 3 January 2014

Code : Fastinput / Fastread Integers

Fastread (for online programming sites)

Some of you who might had been writing code for online programming websites like codechef often use scanf or cin inbuilt methods to get your input. This article that i am posting today is based on several arguments that programmers believe and is true to the most extent. I will be telling about gathering input from the stream in a faster way into variables which i am assuming here to be integer types. Later you will come to realize that with a small change in the code (by appending some if's and else's) you can do the same for floats as well. However i would be concerned with integers and their bigger counterparts only.

Getting that fast from input might save some of your time and is good when you are in a competition where the ranking is separated by the amount of time your code eats. You might want to make it tighter and fast enough to beat your opponents.

Also i used a word 'stream' above in my article. If you are not familiar with it follow this wiki link (i am sorry its from wikipd but its reliable for the moment ;) ).

Anyways, gathering input such as numbers can be done by gathering characters, and then internally manipulating them to form the number. Everything that is an input in the stream to your code is a character or a string. Hence apart from using scanf/cin you can go for opting string methods for input. One of which - getchar() - you might had probably heard about. Now the code that you are going to see below uses a specific targeted function called getchar_unlocked(). As the name would suggest it has something to do with locks. Every time an input source directs data into your code, all other input sources get notified and are locked i.e. they would not keep on producing input that while one of them in already doing it. Hence functions like getchar check for a lock on input streams that are pointing to the code. However in case when there is only one input stream (the stdin), it becomes useless to check for stream locks as there is just one source. Hence getchar_unlocked() function is designed to not to check stream locks assuming they are already open. We will talk about a feature called thread unsafe of getchar_unclocked() later.

Below is a pretty simple code that might convince you on this input method.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void fastread_int (int * a)
{
  char c=0; *a=0;  //first flush all previous data
  
  // now we shall eat all spaces before the actual num
  //as it would also count otherwise (remember ' ' has
  // also got an ASCII value, i guess 32
  while(c<33){c=getchar_unlocked();}

  //now we shall manipulate the number. i hope
  //you know the math 
  while(c>33)
  {
   *a=(*a<<3)+(*a<<1)+c-'0'; //why? :D
   c=getchar_unlocked();
  }
}
I hope the above code was clear and explainable. You can similarly create other functions which could accept long long int 's as well and functions that read a variable and return it so that you can avoid usage of a variable in your main code if not required. However this creates a problem. For a problem that includes more than one or two data types, your code would certainly look ugly. If you know C++ and have some knowledge of templates, you can avoid this code-jizz around you by shortening your function. You must had observed that all of them are doing the same task but only differ in data type, hence why not use a template. :D
template <typename DataType>;
void fasread_custom ( DataType * a)
{
  char c=0; *a=0; 

  while(c<33){c=getchar_unlocked();}
  while(c>33){*a=(*a<<3)+(*a<<1)+c-'0'; 
  c=getchar_unlocked();  }
}


As a matter of many asked arguments and queries on the relative speeds of input methods, a standard argument given is something like the one given below (you can also refer stackoverflow) :
cin (~=) < printf < getchar < getchar_unlocked
 This means that getchar_unlocked() takes the least time and is the fastest of them.

Two things i am gonna talk about :

 (+) The template code : Here the code seems very well and short and definitely is a good way to handle things. But dont think that this would be just one piece of code at runtime. At runtime, if you queried on a datatype 'int' and other as 'long int', both would maintain seperate spaces during runtime once they are created. This is something called as code bloat.

 (+) thread safe : Well this is a concept that comes from operating systems (if you had heard of mutexes and deadlocks and mutual exclusion principal). Standard streams lock mutexes to avoid access to their stream buffer and hence they are thread safe. However this increases the overhead of locking and mutual exclusion. _unlocked allows to avoid that overhead and is marginally faster.

I hope this article was good enough for some fastread enthusiasts. A small modification can give you results for strings and floats as well. Thats where the fun lies. Give it a try. シ

2 comments:

  1. What should be now written in main to take input with getchar_unlocked

    ReplyDelete
  2. getchar_unlocked( ) will always return a character from the stream. You can write the same in main too. Just make sure you receive this in a char variable and take ASCII values into consideration.

    ReplyDelete