
What is a sitemap file and how to add it. How to create a Sitemap XML for Yandex and Google: step-by-step instructions

With our site map generator, create XML files that can be submitted to Google, Yandex, Bing, Yahoo and other search engines to help them index your site.

Do it in three simple steps:

  • Enter the full URL of the website in the form.
  • Click the Start button and wait until the site is fully scanned. While it runs, you will see the total number of working and broken links.
  • By clicking the "Sitemap.xml" button, save the file to a convenient location.

    A Sitemap is a site map in XML format that the Google search engine began using in 2005 to index the pages of sites. The Sitemap file is a way of organizing a website by identifying the address and the data for each of its sections. Previously, site maps were aimed mainly at the users of a site. The XML format was designed for search engines, allowing them to find the data faster and more efficiently.

    The new Sitemap protocol was developed in response to the increasing size and complexity of websites. Business sites often contain thousands of products in their catalogs, and the popularity of blogs, forums and bulletin boards forces webmasters to update their material at least once a day. It is increasingly difficult for search engines to keep track of all this material. Through the XML protocol, search engines can track URLs more effectively, optimizing their crawl by finding all the information on one page. XML also shows how often a specific website is updated and records the latest changes. XML maps are not a search engine optimization tool: they do not affect ranking, but they allow search engines to evaluate pages for search queries more accurately, by providing data in a form that is convenient for search engines to read.

    General acceptance of the XML protocol means that website developers no longer need to create different types of site maps for different search engines. They can create one file to submit, and then update it when they make changes to the site. This simplifies the whole process of fine-tuning and expanding a website. Webmasters themselves began to see the benefits of using this format. Search engines rank pages according to the relevance of their content to specific keywords, but before the XML format, page content was often not represented properly. This frustrated webmasters who felt that their efforts in creating a website had gone unnoticed. Blogs, additional pages and multimedia files take hours to add. With an XML file, those hours are not wasted: all the well-known search engines will see the pages.

    To create your Sitemap in XML format and keep search engines informed of all the changes on your site, try our free Sitemap generator.

    I have already talked about it using my own site as an example. Its placement also raises no questions: it should be in the root directory of your site ('./'). The only real questions connected with it are why my site needs this file to work and how to create it. That is what we will talk about next.

    Why do you need a sitemap.xml file

    In general, it looks like this.

    A screenshot of a fragment of my sitemap.xml file:

    This file creates a map of the blog or site with all of its pages, similar to the article lists that some bloggers make. The only difference is that this file is needed not by the users who come to your site, but by search engines. Moreover, popular search engines themselves recommend creating this sitemap.xml and submitting it to them. All of it is needed only so that search engines understand which web pages are available for crawling on your site. And again, unlike robots.txt, which prohibits some sections or pages, sitemap.xml provides a list of the pages (URL links) that should be covered by indexing.

    The file itself is an XML document listing the links, that is, the addresses of your web site's pages, plus some extra details that search engines need, such as the date a page was last changed, the change frequency, and the priority. Again, all of this is needed only by search engines, for a more competent crawl of your site. You can also see this file on other blogs, in principle, if you enter their address followed by /sitemap.xml, as on my site.
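
    To make this concrete, here is a minimal sketch of such a file with a single page entry, following the sitemaps.org protocol (the domain, the date and the tag values are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/page/</loc>
        <lastmod>2024-01-15</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.5</priority>
      </url>
    </urlset>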

    In general, this file should help search engines determine the location of the pages on your site or blog for more sensible indexing. But remember that it is just an additional hint for search engines. If this file is not on your server, it does not mean that search engines will not index your pages and that they will not get into the search. Everything will work anyway, but with this file it works better.

    Creating the sitemap.xml file

    As I understand it, you can approach creating this file in different ways. The first is to create the file manually, using the recommendations and examples from the official site sitemaps.org and its XML tags, saving the file with the .xml extension in UTF-8 encoding, and then copying it to the root directory of your website on the server. By the way, this file has certain size limits: no more than 10 MB and 50,000 URLs. There is a solution, though, if you need to make a file with a bigger list of URLs.

    There is also a second option: automatic creation of the Sitemap. There are plenty of web sites on the Internet that provide such a service. For example, the site htmlweb.ru has a Sitemap generator where you only have to enter the full address of your site and click the button to download the XML map. After that, save the finished file to your computer and move it to the root directory of your web site yourself.

    By the way, after you upload this file to the server, you also need to add the full link to the sitemap.xml file at the end of the robots.txt file, for example as I have: 'Sitemap: https://www..xml'. This is necessary to tell search engines the location of this file.
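
    For illustration, with a stand-in domain (example.com is a placeholder, since my own address is abbreviated above), the full line at the end of robots.txt looks like this:

    Sitemap: https://example.com/sitemap.xml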

    The answer is obvious: this file must exist.

    The sitemap.xml file, like a regular site map, is a set of pages marked up with XML tags. Through this file, search engines understand which pages of your site should be indexed first.

    Normal HTML Site Map:

    Sitemap in XML format:

    Each option has its pros and cons. The key advantage of a site map in XML format is that it eliminates the possibility of diluting the static weight of pages inside the site.

    In this article, I will tell you how to make a correct Sitemap.xml file.

    If you already know everything about compiling a Sitemap, go straight to the last chapter, which is called "Tricks".

    1. Sitemap.xml file directives

    There are 3 mandatory directives that must be filled in in any case; these are the tags:

    • <urlset>
    • <url>
    • <loc>

    As well as 3 optional tags:

    • <lastmod>
    • <changefreq>
    • <priority>

    Here is a memo deciphering each tag, from the official site http://www.sitemaps.org/ru/protocol.html:

    • <urlset> (mandatory). Encapsulates this file and indicates the standard of the current protocol.

    • <url> (mandatory). The parent tag for each URL entry. The remaining tags are subordinate to this tag.

    • <loc> (mandatory). The URL of the page itself. It always begins with the protocol prefix (for example, http) and ends with a slash, if your site's server requires it. ATTENTION: the length of the URL must not exceed 2,048 characters.

    • <lastmod> (optional). The date the page was last modified, specified strictly in the W3C Datetime format. If necessary, the time segment can be omitted, using the format YYYY-MM-DD.

    • <changefreq> (optional). Specifies how often the information on the page changes. The value is approximate. Valid values: always, hourly, daily, weekly, monthly, yearly, never. If the page changes every time it is opened, use the value "always". If it is an archive page, use "never". Note that this attribute serves as a hint for the search robot, not a rule, so the relationship between it and the actual crawl frequency of the page is nonlinear.

    • <priority> (optional). Specifies the priority of some pages of your site relative to others. The range of values is from 0.0 to 1.0. By default, each page is given a priority of 0.5. The value only compares page priorities within your own site; it does not affect how your site compares with competitors' sites in search. Moreover, setting the maximum priority for all pages is pointless: the values will then all look the same to the robot, and the attribute simply will not work. So do not look for loopholes; specify objective priorities for the pages.

    Save this memo; it will be useful to you at first. It is worth noting another plus of XML site maps: flexibility. The flexibility lies in being able to combine the different optional directives.

    Now that you have a clear picture of what an XML site map is and have learned the main directives of this file, you can move on to compiling it.

    2. Compiling the sitemap.xml file

    You can create a site map in 3 ways:

    • Manually;
    • Automatically, using special services;
    • Automatically, using ready-made solutions such as plug-ins for your CMS, etc.

    The process of preparing the site map is as follows:

    • We make the site map in one of the ways listed above;
    • We check it for validity using the search engines' services (https://webmaster.yandex.ru/sitemaptest.xml);
    • We place the file on the site;
    • We specify the path to the site map for search robots in the robots.txt file (by the way, there is a separate article about that);
    • We specify the Sitemap in the Yandex and Google webmaster panels.

    So how do you make the site map file?

    Let's analyze an example of compiling the file manually. Suppose you want to add 5 pages of your site to the site map.

    This is what the site map should look like in XML format:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://site.ru/url-o_kompanii/</loc>
      </url>
      <url>
        <loc>http://site.ru/url-uslugi/</loc>
      </url>
      <url>
        <loc>http://site.ru/url-produkty/</loc>
      </url>
      <url>
        <loc>http://site.ru/url-dostavka/</loc>
      </url>
      <url>
        <loc>http://site.ru/url-kontakty/</loc>
      </url>
    </urlset>

    If necessary, add the optional tags from the memo I gave above. The additional tags are written inside the <url> container, after the page URL specified in the <loc> tag. For example:

    <url>
      <loc>http://site.ru/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
    </url>

    The code above tells the search engine that the page http://site.ru/ was last changed on January 1, 2005, is updated about once a month, and has a priority of 0.8 (the maximum possible is 1).

    By the way, the paired-tag highlighting in code editors is especially convenient for this work.

    Sitemap generation services

    If you have a lot of useful pages on your site and do not want to spend time making the file manually, the following services will help you:

    There are a lot of such services. I use https://www.xml-sitemaps.com/.

    I will briefly explain all the settings:
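
    If you would rather not depend on an online service, the generation can also be scripted. Here is a minimal Python sketch, assuming a plain-text file urls.txt with one URL of your site per line (the file name is an assumption):

    import xml.etree.ElementTree as ET
    from datetime import date

    NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

    def build_sitemap(urls, out_path="sitemap.xml"):
        # Register the default namespace so the output uses plain <url> tags.
        ET.register_namespace("", NS)
        urlset = ET.Element("{%s}urlset" % NS)
        for page in urls:
            url_el = ET.SubElement(urlset, "{%s}url" % NS)
            ET.SubElement(url_el, "{%s}loc" % NS).text = page
            # Optional hint: record today as the last-modified date.
            ET.SubElement(url_el, "{%s}lastmod" % NS).text = date.today().isoformat()
        ET.ElementTree(urlset).write(out_path, encoding="UTF-8", xml_declaration=True)

    with open("urls.txt", encoding="utf-8") as f:
        build_sitemap([line.strip() for line in f if line.strip()])

    The resulting sitemap.xml is then uploaded to the site in the same way as a file downloaded from a generator.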

    Plugins for CMS.

    There is a huge number of plugins and ready-made solutions for content management systems.

    Validity check

    After creating the site map, you need to check it for errors. For the check, use the service https://webmaster.yandex.ru/sitemaptest.xml

    After a successful check, we upload our file to the site.

    File location

    Unlike the robots.txt file, the sitemap.xml file can be located anywhere on your site. For example, in the root folder of the site the file will be available at the following address:

    http://site.ru/sitemap.xml

    If you have placed the file in the /files/ folder, it will be available at this address:

    http://site.ru/files/sitemap.xml

    After successfully uploading the file, be sure to tell search robots how to find it. This is done very simply: in the robots.txt file, specify the full address of the file in the Sitemap directive. For example, a robots.txt file might look like this (a minimal sketch; the Disallow rule is only illustrative):
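
    User-agent: *
    Disallow: /admin/
    Sitemap: http://site.ru/sitemap.xml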

    Important! Unlike robots.txt, there may be several Sitemap files. In that case, you must specify the full address of every Sitemap file in robots.txt and in the webmaster panels.

    Search Console and Yandex.Webmaster Panel

    The last stage remains: specify the path to the site map in the webmaster panels of the search engines.

    It is worth noting the restrictions on an XML site map:

    • One file may list no more than 50,000 URLs
    • The file must weigh no more than 10 megabytes (search engines do not index documents exceeding 10 megabytes in size). If necessary, the file can be compressed with the gzip archiver.
    • The file encoding must be UTF-8 only

    That completes the main stages. Follow all the simple rules described, and you will not run into errors. The second part of the article is devoted to more detailed configuration and to the subtleties and features of sitemap.xml. This knowledge will be needed to compile a professional site map for an online store.

    3. Grouping Sitemap files

    If the limit of 50,000 URLs is exceeded, you need to use a nested structure and create a group of several sitemaps. That is, create site maps inside a site map!

    For a regular site (not a large portal or online store) such page volumes are a rarity, so most SEO specialists use Sitemap grouping for convenience, for example, to group product pages or sections.

    The syntax looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <sitemap>
        <loc>http://site.ru/sitemap1.xml.gz</loc>
        <lastmod>2004-10-01T18:23:17+00:00</lastmod>
      </sitemap>
      <sitemap>
        <loc>http://site.ru/sitemap2.xml.gz</loc>
        <lastmod>2005-01-01</lastmod>
      </sitemap>
    </sitemapindex>

    Definitions of the XML tags:

    • <sitemapindex> (mandatory). Encapsulates information about all the Sitemap files in this file.
    • <sitemap> (mandatory). Encapsulates information about an individual Sitemap file.
    • <loc> (mandatory). Specifies the location of the Sitemap file.
    • <lastmod> (optional). Specifies the time the corresponding Sitemap file was changed. The robot uses this information to understand which Sitemap files have changed. Indirectly, this tag allows the robot to discover new site pages faster.

    That covers the grouped site map. All the other procedures are the same as described above. Do not forget to specify the correct link to your file in the robots.txt file, in the Sitemap directive.

    4. Research

    At the end of 2014, I conducted a small study analyzing the effectiveness of the sitemap.xml file on a site.

    There was a problem with the indexing of a product group on the website of an online store (about 10,000 products). At the same time, nothing was obstructing their indexing. A site map file was compiled consisting only of links to the site's products, and it was updated automatically. Within 2 months, more than 70% of the URLs appeared in the Yandex index base. A large proportion of the pages in the index were pages from the site map. I note that during this period no other measures to accelerate the indexing of the site were taken (for example, link building).

    Here are the results:

    Conclusion: the site map really does affect the indexing of your site. You must regularly refresh and update this file.

    5. Tricks

    So that the article does not turn out boring for experienced SEO specialists, I suggest you familiarize yourself with the following "tricks".

    Image Sitemap

    To attract additional traffic from search engines (perhaps not very conversion-prone traffic), you can make an additional sitemap for images.

    The syntax for the image site map looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <loc>http://example.com/primer.html</loc>
        <image:image>
          <image:loc>http://example.com/kartinka.jpg</image:loc>
        </image:image>
        <image:image>
          <image:loc>http://example.com/photo.jpg</image:loc>
        </image:image>
      </url>
    </urlset>

    Memo on the XML tags:

    • <image:image> (required). Contains all the information about one image. Each URL (<loc> tag) may include up to 1,000 <image:image> tags.
    • <image:loc> (required). The URL of the image. In some cases the domain of the image URL may differ from the domain of the main site. If both domains are verified in Search Console, there will be no problems. But if the images are hosted using a content management system, for example the Google Sites service, you need to verify the hosting site in Search Console. In addition, the robots.txt file must not prohibit crawling of the content that needs to be indexed.
    • <image:caption> (optional). A caption for the image.
    • <image:geo_location> (optional). The place of the shoot. For example, Poronaysk, Sakhalin Island.
    • <image:title> (optional). The image title.
    • <image:license> (optional). The URL of the image's license.

    A life hack for those who are still reading

    Many SEO specialists generate Sitemap files once, at the start of a project, and then forget about the site map. The pages got indexed: good. They did not: well, what can you do?! New pages are often simply forgotten and never added to the Sitemap.

    During my research, I found that the most convenient solution to this problem is a separate sitemap.xml file containing only those site pages that have not yet made it into the index.

    And that is exactly what drove 70% of the new URLs into the Yandex index.
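
    Here is a minimal Python sketch of this trick. The input file names (all_urls.txt with every page of the site, indexed_urls.txt with the pages already found in the index) are assumptions, and the output uses the plain-text sitemap format, which search engines also accept:

    def read_urls(path):
        # Read a plain-text list of URLs, one per line.
        with open(path, encoding="utf-8") as f:
            return {line.strip() for line in f if line.strip()}

    not_indexed = sorted(read_urls("all_urls.txt") - read_urls("indexed_urls.txt"))

    # Write the difference as a text sitemap: one URL per line, UTF-8.
    with open("sitemap-new.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(not_indexed) + "\n")

    print(len(not_indexed), "pages are still waiting for the index")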

    From this article you will learn how to create a Sitemap file and give Google access to it.

    Creating and submitting Sitemap files

    Sitemap file formats

    Google supports several Sitemap file formats, described below. In all formats, use the standard Sitemap protocol. Google currently does not support the <priority> attribute in Sitemap files.

    The following limits apply to all formats: a Sitemap file may contain no more than 50,000 URLs, and its uncompressed size must not exceed 50 MB. If the size of the file or the number of addresses listed in it exceeds these limits, break it into several parts. You can create a Sitemap index file that lists all the Sitemap files, and send them to Google all at once.
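
    A minimal Python sketch of such splitting (the domain, the file names and the demo URL list are hypothetical); the parts use the plain-text format and are listed in an index file:

    LIMIT = 50000  # per-file URL limit from the protocol

    def write_parts(urls, domain="http://www.example.com/"):
        # Split the URL list into parts and write a Sitemap index referencing them.
        index_rows = []
        for n in range(0, len(urls), LIMIT):
            name = "sitemap%d.txt" % (n // LIMIT + 1)
            with open(name, "w", encoding="utf-8") as f:
                f.write("\n".join(urls[n:n + LIMIT]) + "\n")
            index_rows.append("  <sitemap><loc>%s%s</loc></sitemap>" % (domain, name))
        with open("sitemap_index.xml", "w", encoding="utf-8") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
                    + "\n".join(index_rows) + "\n</sitemapindex>\n")

    write_parts(["http://www.example.com/page-%d/" % i for i in range(120000)])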

    Text file

    If there are only page addresses in the Sitemap file, you can send Google a plain text file with these URLs (one per line). Example:

    http://www.example.com/file1.html
    http://www.example.com/file2.html

    • You must use the UTF-8 encoding.
    • The file must not contain anything other than the list of URLs.
    • You can give the text file any name, but it must use the .txt extension (for example, sitemap.txt).

    Google Sites

    If the site was created and verified using the Google Sites service, the Sitemap file is created automatically. It cannot be changed, but you can submit it to Google to receive data for reports. Note that if one subdirectory contains more than 1,000 pages, the Sitemap file may be displayed incorrectly.

    • If your pages are hosted on Google Sites, the Sitemap file is located at http://sites.google.com/site/VashSait/system/feeds/sitemap.
    • If the site was created using Google Apps, the URL of the Sitemap file looks like this: http://sites.google.com/VashDomen/VashSait/system/feeds/sitemap.

    Sitemap file extensions

    Google supports extended syntax in the Sitemap file for the content described below. With it, you can add descriptions of video, image and other content to improve its indexing.

    Recently I have often had to answer questions related to Sitemap files. The surge of interest in this far-from-most-important aspect of site optimization is explained by the release of the new version of Yandex.Webmaster, which flags the absence of a site map as an error.

    In chapter " Possible problems»Show the following notice:

    No sitemap files used by the robot
    The robot does not use any SiteMap file. It may negatively affect the speed of indexing new pages of the site. If the Sitemap's correct files are already added to the processing queue, the message will automatically disappear with the start of their use.
    Note the "Sitemap files" section.

    Is this warning scary? Do you need to create a sitemap at all, and if so, why? Let's sort it all out.

    What is Sitemap and what is it intended for?

    The XML format is used most often; it allows you to specify, in addition to the URLs themselves, some of their characteristics (update and change frequency, relative importance of the page). However, you can use a completely simple structure as well: a TXT file containing a list of URLs (each on a new line), and nothing else.

    The purpose of the list is to give search spiders information about the documents on the site. It helps the robot find all the pages of the resource and add them to the search results. The additional data in the XML version is a recommendation to the search spiders to visit certain pages more or less often. By the way, I have heard nothing about how those recommendations are actually honored. It may well be that they are not taken into account at all, or are a much weaker signal compared with other factors.

    Myths about the site map

    1. It is enough to make sitemap.xml, and you can stop worrying about indexing

    This is the most frequent misconception; I come across it regularly. In fact, for large sites (hundreds of thousands of pages), completeness of the index is one of the most important problems, and simply placing a site map does not solve it. The map gives the robot the ability to find all the pages, but this does not mean that a) nothing else is preventing indexing, for example technical problems, and b) the search engine will find the page "worthy" of being in the search.

    2. A Sitemap is required for every site

    Small projects (up to a thousand pages) with a more or less adequate structure, where any page can be reached in a couple of clicks, feel great without one. This is clear both from general considerations (the main mechanism for finding material for indexing is following internal links) and from practice. We have seen dozens of sites without a map that were perceived by the robot completely correctly.

    Finally, Google says the same thing in its help:

    If the pages of the site are correctly linked to each other, search robots can discover most of the material. Nevertheless, a Sitemap file can improve the crawling of the site, especially in the following cases:

    • The site is very large. Google search robots may skip newly created or modified pages.
    • The site contains a large archive of pages that are not linked to each other. For them to be successfully crawled, they can be listed in the Sitemap file.
    • Your site was created recently, and there are few links to it. The Googlebot robot and other search robots crawl the Internet by following links from one page to another, so if there are few links to your site, it will be hard to find.
    • The site uses multimedia content, is displayed in Google News, or uses other annotations compatible with Sitemap files. Sitemap files can supply additional information for display in the search results.

    3. If you delete a page from Sitemap, it will fall out of the index

    A myth along the same lines. I have encountered a huge number of sites where the Sitemap fell off due to technical problems or was served to robots in a heavily trimmed form. It could harm the discovery of new pages, but everything was fine with the old ones.

    On the contrary, a technique often used is to "remove everything already indexed from the map in order to focus the robot on the new pages." It gives a certain effect in terms of optimizing the crawling of the site. However, in most cases I do not recommend using it; see the reasons below.

    4. Be sure to configure all the additional parameters (priority, update frequency)

    No. As mentioned, you can easily use a plain TXT file with a list of URLs. Of course, specifying the maximum of information in the map will not make things worse. But:

    1. There is no reliable data showing that search engines really take these instructions into account. In fact, Yandex often ignores even a more stringent recommendation: the Last-Modified server header and If-Modified-Since (see).
    2. Even if the signals are taken into account strictly according to the search engines' statements, that is, as a recommendation, the gain in crawl efficiency will most often be very insignificant. The exception is truly large projects, where completeness of the index is critical.
    3. Specifying all the data requires additional painstaking work to choose the values.
    4. Similarly, setting up file generation with all the parameters means additional development costs.
    5. Points 3 and 4 are even more serious than they seem. After all, the site changes, and the extended data must change with it, otherwise the recommendations will become irrelevant.

    I think this is enough about the myths; let's move on to recommendations.

    How to work with Sitemap?

    Most of the necessary information about creating the files and giving robots access to them is contained in the search engines' help; see the Google and Yandex help pages. I will tell you about a few less obvious points.

    First, a file with a list of the site's URLs that is easy to access can be useful not only to search robots. It is extremely convenient for a number of SEO analytics tasks.

    A couple of examples.

    Evaluating the completeness and quality of the index

    Once we know for certain the number of pages available to search engines (counting the links in the map is easy), we can quickly assess how fully the site is indexed. We make a rough estimate using the "site:" operator (better with a few tricks, see).

    If the number of pages in the search is smaller than in the map, we find the ones that escaped the robots and drive them into the search: by editing the structure, tweeting them, and so on.

    If it is larger, then randomly generated "trash" pages may have got into the search. They need to be found and either brought up to standard or closed off with robots.txt, canonical, or meta tags. Again, to find the unnecessary pages, it is useful to have the list of the necessary ones, which is exactly what the Sitemap is.

    Finding pages that do not bring traffic

    If a page is on the site but has not brought us visitors for a long time, something is wrong with it. Such URLs need to be found so the causes can be dealt with; this often helps raise traffic.

    How to do it? At least like this:

    We build a report in Metrica on the landing pages from search for the quarter:

    We filter by source, choosing one of the search engines we work with:

    And we unload the list of pages (the table data) into Excel.

    Now all that remains is to:

    a) convert the XML map into Excel (there is a sea of online converters for this);

    b) using Excel functions, find the URLs that are in the column from the map but not in the column from Metrica.
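
    The same comparison can also be done without Excel. A minimal Python sketch, assuming the sitemap is available at its usual address and the Metrica report was exported as metrika.csv with the page URL in the first column (both are assumptions):

    import csv
    import urllib.request
    import xml.etree.ElementTree as ET

    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    # Pull the URL list straight from the sitemap.
    with urllib.request.urlopen("http://site.ru/sitemap.xml") as resp:
        root = ET.fromstring(resp.read())
    sitemap_urls = {loc.text.strip() for loc in root.findall(".//sm:loc", NS)}

    # Landing pages that actually received search traffic (Metrica CSV export).
    with open("metrika.csv", encoding="utf-8") as f:
        traffic_urls = {row[0].strip() for row in csv.reader(f) if row}

    # Pages present in the sitemap but absent from the traffic report.
    for url in sorted(sitemap_urls - traffic_urls):
        print(url)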

    The algorithm looks rather cumbersome, but there is nothing complicated in it. And for those who (like me) value their time, there is automation of the process: one of the reports of my site-analysis service does exactly this. For example, just yesterday a report came out that, based on the Sitemap, searches for potentially dangerous pages that competitors wishing to harm the site are driving into the index with link spam.

    There are many more examples. The point is not the examples themselves, but the fact that having a current list of the site's pages at hand is very useful. You can quickly access it with various services and software (in the extreme case, with Excel) and use it in the process of optimizing the site.

    What more can I say: even a standard technical audit with a crawler is more convenient if you feed it not the URL of the main page, as usual, but the Sitemap. The process becomes more manageable: you can select part of the pages in advance (for example, by the nature of the problem) and not wait until all the others are processed.

    That was the first not entirely obvious point.

    How best to give robots access to the map?

    In some cases, it is better not to specify the link to the Sitemap in robots.txt, but to submit it manually through Google Search Console and Yandex.Webmaster. The thing is, anyone can see robots.txt. For example, some villain who is looking for content to scrape. Do not make it easier for him.

    If you want to crank the SEO paranoia up even further, you can avoid the standard name (sitemap.xml) and call the file something else, so it cannot be found by typing in the traditional name.

    I will not say this is a particularly critical piece of advice, but why not play it safe, if it is easy?

    Summary

    1. The Sitemap file helps with the indexing of a site, but it is not a panacea. If there are problems with the completeness of the index, they need to be solved comprehensively.
    2. Using one is optional, but desirable for large sites and some specific tasks (see the quote from the Google help above).
    3. The previous point holds as far as the task of "making life easier for search robots" goes. However, for the tasks of analyzing the site and making decisions during optimization, it is convenient to have one at hand for all sites (except very small ones).
    4. The most important requirement for a Sitemap (besides compliance with the standards) is completeness and currency. For SEO analytics tasks, the map acts as a benchmark against which other lists of URLs are compared (those in the index; those that have incoming links; those that receive transitions from search, and so on). Therefore, when creating it, you need to take care of its regular updating right away.
    5. If there is a need to manage indexing by removing already indexed pages from the Sitemap, you can keep 2 different files: one to give to robots, and another for your own analysis.

    Phew, it seemed like a simple topic, and yet the article is almost 1,500 words. I congratulate myself on the writing and you on the reading. Neither of us spent this time in vain!