Multithread HttpWebRequest hangs randomly on responseStream

2 posts, 0 answers
  1. Franco

    Posted 24 Jul 2014

    I'm writing a multithreaded web crawler that issues many concurrent HttpWebRequests every second across hundreds of threads. The application works well, but occasionally (at random) one of the requests hangs while reading from the stream returned by GetResponseStream(), completely ignoring the timeout. This happens when I run hundreds of requests concurrently, and it means the crawl never finishes. The strange thing is that with Fiddler running this never happens and the application never hangs. It is really hard to debug because it happens randomly.

    I've tried to set

    Keep-Alive = false

    ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3;

    but I still get the strange behavior, any ideas?

    Thanks

    HttpWebRequest code:

         
    public static string RequestHttp(string url, string referer, ref CookieContainer cookieContainer_0, IWebProxy proxy)
           {
               string str = string.Empty;
               HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
               request.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
               request.UserAgent = randomuseragent();
               request.ContentType = "application/x-www-form-urlencoded";
               request.Accept = "*/*";
               request.CookieContainer = cookieContainer_0;
               request.Proxy = proxy;
               request.Timeout = 15000;
               request.Referer = referer;
               using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
               {
                   using (Stream responseStream = response.GetResponseStream())
                   {
                       List<byte> list = new List<byte>();
                       byte[] buffer = new byte[0x400];
                       int count = responseStream.Read(buffer, 0, buffer.Length);
                       while (count != 0)
                       {
                           list.AddRange(buffer.ToList<byte>().GetRange(0, count));
                           if (list.Count >= 0x100000)
                           {
                               break;
                           }
                           count = 0;
                           try
                           {
                               count = responseStream.Read(buffer, 0, buffer.Length); // HERE IT HANGS SOMETIMES
                               continue;
                           }
                           catch
                           {
                               continue;
                           }
                       }
     
                       int num2 = 0x200 * 0x400;
                       if (list.Count >= num2)
                       {
                           list.RemoveRange((num2 * 3) / 10, list.Count - num2);
                       }
                       byte[] bytes = list.ToArray();
                       str = Encoding.Default.GetString(bytes);
                       Encoding encoding = Encoding.Default;
                       if (str.ToLower().IndexOf("charset=") > 0)
                       {
                           encoding = GetEncoding(str);
                       }
                       else
                       {
                           try
                           {
                               encoding = Encoding.GetEncoding(response.CharacterSet);
                           }
                           catch
                           {
                           }
                       }
                       str = encoding.GetString(bytes);
     
                   }
               }
               return str.Trim();
           }
  2. Eric Lawrence
    Admin

    Posted 25 Jul 2014

    Hello, Franco--

    As far as I understand things, the HttpWebRequest object's Timeout property doesn't work the way a developer might typically expect.
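
To make the Timeout point concrete: on HttpWebRequest, Timeout bounds GetResponse() and GetRequestStream(), while each Read on the response stream is bounded by the separate ReadWriteTimeout property, which defaults to 300,000 ms (five minutes). The posted code sets only Timeout. Below is a minimal sketch, assuming the hang is a stalled read, which is not confirmed in this thread. TimeoutSketch and CreateRequest are hypothetical names.

```csharp
using System;
using System.Net;

public static class TimeoutSketch
{
    // Hypothetical helper: builds a request with both timeouts set explicitly.
    public static HttpWebRequest CreateRequest(string url, int timeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // Timeout covers GetResponse() and GetRequestStream() only.
        request.Timeout = timeoutMs;

        // ReadWriteTimeout covers each individual Read/Write on the stream.
        // Left unset, it defaults to 300,000 ms (five minutes).
        request.ReadWriteTimeout = timeoutMs;

        return request;
    }
}
```

With both values set, a stalled Read should fail with an exception instead of blocking for the default five minutes. Note that the bare catch in the posted loop would swallow that exception and end the loop, so the method would return a truncated body without reporting an error.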

    Generally speaking, writing code that requires hundreds of threads is a bad idea as you lose a ton of time due to context-switching overhead. The best practice is to use the Async versions of the functions so that only a few threads are needed.
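
As a rough illustration of the async approach, the sketch below keeps a bounded number of requests in flight without dedicating a thread to each one. It assumes .NET 4.5 or later for async/await. AsyncCrawlSketch, FetchAllAsync, and the fetch delegate are hypothetical names, not part of the posted code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncCrawlSketch
{
    // Runs fetch(url) for every URL with at most maxConcurrency calls in flight.
    // Results come back in the same order as the input URLs.
    public static async Task<string[]> FetchAllAsync(
        IEnumerable<string> urls,
        Func<string, Task<string>> fetch,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task<string>> tasks = urls.Select(async url =>
            {
                // Waiting here does not block a thread.
                await gate.WaitAsync();
                try
                {
                    return await fetch(url);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList(); // ToList starts every task immediately.

            return await Task.WhenAll(tasks);
        }
    }
}
```

The fetch delegate could be something like `url => client.GetStringAsync(url)` with one shared HttpClient instance. Be aware that HttpWebRequest.Timeout has no effect on requests started with BeginGetResponse, so an async rewrite needs its own timeout handling (HttpClient has a separate Timeout property).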

    Your use of List<byte> is also unusual; is there some reason you're not using a MemoryStream here?
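
For comparison, here is a sketch of the same read loop built on a MemoryStream. It avoids copying the whole buffer into a new list on every iteration, as `buffer.ToList<byte>().GetRange(0, count)` does. StreamReadSketch and ReadUpTo are hypothetical names. Unlike the posted loop, which can overshoot its 1 MiB cap by up to one buffer, this version stops at exactly maxBytes.

```csharp
using System;
using System.IO;

public static class StreamReadSketch
{
    // Reads at most maxBytes from source and returns them as an array.
    public static byte[] ReadUpTo(Stream source, int maxBytes)
    {
        byte[] buffer = new byte[1024];
        using (var collected = new MemoryStream())
        {
            while (collected.Length < maxBytes)
            {
                // Never request more than the remaining budget.
                int wanted = (int)Math.Min(buffer.Length, maxBytes - collected.Length);
                int count = source.Read(buffer, 0, wanted);
                if (count == 0)
                {
                    break; // end of stream
                }
                collected.Write(buffer, 0, count);
            }
            return collected.ToArray();
        }
    }
}
```

In the posted method this would be called as `ReadUpTo(responseStream, 0x100000)`. Exceptions from Read are deliberately left to propagate so that a timeout is visible to the caller.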

    Have you adjusted the number of concurrently allowed requests from the ServicePoint?
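
For reference, the process-wide default for that limit can be raised through ServicePointManager.DefaultConnectionLimit. A minimal sketch follows. ConnectionLimitSketch and Raise are hypothetical names, and whether the limit has anything to do with the hang is not established in this thread.

```csharp
using System;
using System.Net;

public static class ConnectionLimitSketch
{
    // Raises the default number of concurrent connections per ServicePoint.
    public static void Raise(int limit)
    {
        // Only affects ServicePoint objects created after this assignment,
        // so call it once at startup, before the first request is issued.
        ServicePointManager.DefaultConnectionLimit = limit;
    }
}
```

On the .NET Framework the default for a client application is 2 connections per host, so requests beyond that queue up behind the ones already running.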

    Overall, however, I don't have any theories as to why you're sometimes seeing a request hang here. You might try enabling System.Net tracing, but it sounds like the volume would be quite high.

    Regards,
    Eric Lawrence
    Telerik