How to add a robots.txt file to an AEM server to provide rules for web crawlers?
How do I add a robots.txt file in AEM/CQ?
Most of you will refer to this link to implement this.
Although it may seem to serve the purpose, you will notice one thing that could be a little "not right".
Adding a robots.txt file directly in CRXDE creates a node of type nt:file at the root level.
So when you hit http://localhost:4502/robots.txt, the file downloads instead of being displayed in the browser.
This is because of Sling's Default GET servlet. The servlet sees that the node type is nt:file and sends the response with these headers:
Content-Type: application/octet-stream
Content-Disposition: attachment;filename=robots.txt
To overcome this, implement a filter as follows. The filter answers the request itself, so the call never reaches Sling's Default GET servlet and you can set a content type of your own.
package com.hds.exp.filters;

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

import org.apache.felix.scr.annotations.Properties;
import org.apache.felix.scr.annotations.Property;
import org.apache.felix.scr.annotations.sling.SlingFilter;

@SlingFilter(order = 1)
@Properties({
    @Property(name = "service.pid", value = "com.hds.exp.filters.RobotsFilter", propertyPrivate = false),
    @Property(name = "service.description", value = "Provides Robots.txt", propertyPrivate = false),
    @Property(name = "service.vendor", value = "DD Exp", propertyPrivate = false),
    @Property(name = "pattern", value = "/.*", propertyPrivate = false)
})
public class RobotsFilter implements javax.servlet.Filter {

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // Unused
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest httpServletRequest = (HttpServletRequest) request;
        if ("/robots.txt".equals(httpServletRequest.getRequestURI())) {
            // Answer here and do not call the chain, so the Default GET
            // servlet never handles this request.
            response.setContentType("text/plain");
            PrintWriter writer = response.getWriter();
            // These rules block all crawlers from the whole site; adjust as needed.
            writer.print("User-agent: *");
            writer.print("\n");
            writer.print("Disallow: /");
            writer.print("\n");
            writer.flush();
        } else {
            // Any other request continues through the normal processing chain.
            chain.doFilter(request, response);
        }
    }

    @Override
    public void destroy() {
        // Unused
    }
}
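The two decisions the filter makes (does this URI match, and what body is written) can be checked without a running AEM instance. The following is only a minimal sketch of that logic in plain Java. The class name RobotsTxtSketch and its methods are hypothetical and are not part of the filter or of any Sling API.

```java
// Hypothetical helper, not part of the original answer: it isolates the two
// decisions RobotsFilter makes so they can be checked without Sling or AEM.
class RobotsTxtSketch {

    // Path the filter intercepts, taken from the doFilter method above.
    static final String ROBOTS_PATH = "/robots.txt";

    // True only for an exact match on the request URI, mirroring the
    // equals() check in the filter. A null URI is treated as no match.
    static boolean isRobotsRequest(String requestUri) {
        return ROBOTS_PATH.equals(requestUri);
    }

    // The body the filter writes: one rule group that blocks every crawler
    // from the whole site.
    static String body() {
        return "User-agent: *\n" + "Disallow: /\n";
    }

    public static void main(String[] args) {
        System.out.println(isRobotsRequest("/robots.txt"));         // true
        System.out.println(isRobotsRequest("/content/robots.txt")); // false
        System.out.print(body());
    }
}
```

Because the match is exact, a request such as /content/robots.txt passes through to the rest of the chain untouched.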
The only problem I have with this approach is that the if statement is now evaluated for every single request. Somehow I feel leaving it to your web server would be preferable. –
Quark