Fabrice Canel, the Principal Program Manager of the Bing Index Generation team, posted the team's sitemaps best practices guide for large web sites, which shows off some pretty awesome numbers. For starters, Bing can support up to 125 trillion links through multiple XML sitemap files. If you're really curious, a single sitemap index file can reference 50,000 sitemaps with 50,000 links each (50,000 x 50,000), which brings you to a whopping total of 2,500,000,000 links (2.5 billion). The total size of your sitemap XML files is allowed to reach more than 100GB.
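For context, the 50,000 x 50,000 figure comes from the sitemaps.org protocol limits: a sitemap index file may reference up to 50,000 child sitemaps, and each child sitemap may list up to 50,000 URLs. Here is a minimal sketch of generating such an index; the domain and file names are hypothetical placeholders, not anything from Bing's post.

```python
# Sketch: generate a sitemap index that points at up to 50,000 child sitemaps.
# The domain and file names below are hypothetical placeholders.
from datetime import date
from xml.sax.saxutils import escape

MAX_SITEMAPS_PER_INDEX = 50_000   # sitemaps.org limit on <sitemap> entries per index
MAX_URLS_PER_SITEMAP = 50_000     # sitemaps.org limit on <url> entries per sitemap

def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document listing the given child sitemap URLs."""
    if len(sitemap_urls) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("a single index may reference at most 50,000 sitemaps")
    today = date.today().isoformat()
    entries = "\n".join(
        f"  <sitemap>\n    <loc>{escape(u)}</loc>\n    <lastmod>{today}</lastmod>\n  </sitemap>"
        for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</sitemapindex>\n"
    )

# 50,000 child sitemaps x 50,000 URLs each works out to the 2.5 billion URL ceiling.
children = [f"https://www.example.com/sitemaps/sitemap-{i}.xml" for i in range(3)]
print(build_sitemap_index(children))
```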
Bing, however, recommends you don't list such a large number of URLs. It's rare that Bing will index all of those URLs; more often Bing will focus on the more important ones – and those should really be the URLs you're submitting. For extremely large websites, you'll want to split your links across two sets of sitemap files, updated on alternating days, to ensure that the Bing crawler successfully discovers all your site's URLs.
To mitigate these issues, a best practice to help ensure that search engines discover all the links of your very large web site is to manage two sets of sitemap files: update sitemap set A on day one, update sitemap set B on day two, and continue alternating between A and B. Use a sitemap index file to link to both sets, or have two sitemap index files, one for A and one for B. This method gives search engines enough time (24 hours) to download the set of sitemaps that is not being modified, and so helps ensure that they have discovered all of your site's URLs within the past 24 to 48 hours.
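As a rough illustration of that alternating schedule, the sketch below picks which set to regenerate based on the day, so each set sits untouched for roughly 24 hours while crawlers fetch it. The file names and the write_sitemap_set() helper are hypothetical; the real regeneration logic depends on how your site's URLs are stored.

```python
# Sketch of the alternating-set schedule: regenerate set A one day, set B the next.
# File names and the write_sitemap_set() helper are hypothetical placeholders.
from datetime import date

SITEMAP_SETS = {
    "A": "sitemap-index-a.xml",   # index file listing set A's child sitemaps
    "B": "sitemap-index-b.xml",   # index file listing set B's child sitemaps
}

def set_for_today(today: date) -> str:
    """Alternate between the two sets day by day (even day -> A, odd day -> B)."""
    return "A" if today.toordinal() % 2 == 0 else "B"

def write_sitemap_set(index_file: str) -> None:
    # Placeholder: regenerate this set's child sitemaps and rewrite its index file.
    print(f"regenerating {index_file}")

if __name__ == "__main__":
    write_sitemap_set(SITEMAP_SETS[set_for_today(date.today())])
```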
Feel free to head on over to the Bing Webmaster Blog for the full post.