3 reasons not to block GPTBot from crawling your site


The next phase in ChatGPT’s rapid rise is the adoption of GPTBot. This new iteration of OpenAI’s technology involves crawling webpages to deepen the output ChatGPT can provide.

Improving AI seems like a positive, but it is not that clear-cut. Legal and ethical issues surround the technology.

The arrival of GPTBot has highlighted these concerns, as many major brands are blocking it instead of exploiting its potential.

[Image: Websites that block GPTBot]

But I truly believe there is much more to be gained than lost by fully (and responsibly) embracing GPTBot.

Why do AI bots like GPTBot crawl websites?

Understanding why bots like GPTBot do what they do is the first step to embracing this technology and realizing its potential.

Simply put, bots like GPTBot crawl websites to gather information. The key difference is that instead of an AI platform passively being given data to learn from (the “training set,” if you will), a bot can actively pursue information across the web by crawling different pages.

Large language models (LLMs) scour these websites in an attempt to understand the world around us. Google’s C4 dataset makes up a large portion (15.7 million sites) of the training material for these LLMs. They also crawl other authoritative, informative sites such as Wikipedia and Reddit.

The more sites these bots can crawl, the more they learn and the better they can become. So why are companies blocking GPTBot from crawling?

Do brands that block GPTBot have justified fears?

When I first read about companies blocking GPTBot from crawling their websites, I was confused and surprised.

It seemed incredibly shortsighted to me. But I figured there must be considerations I wasn’t thinking about deeply enough.

After researching and talking to agency professionals with a legal background, I discovered the biggest reasons.

Lack of compensation for their own training data

A lot of brands block GPTBot from crawling their site because they don’t want their data used to train the models without compensation. While I can understand wanting a piece of the $1 billion pie, I think this is a short-sighted view.

ChatGPT, like Google and YouTube, is an answer engine for the world. Preventing your content from being crawled by GPTBot may limit your brand’s reach to a smaller group of Internet users in the future.

Security issues

Another reason behind the anti-GPTBot sentiment is security. While it is more valid than greedy data hoarding, it is still a largely unfounded concern from my perspective.

[Image: Top reasons why organizations ban ChatGPT]

By now, all websites should be very secure. Not to mention that the content GPTBot is trying to access is public, non-sensitive content: the same material that Google, Bing and other search engines crawl every day.

What caches of sensitive information do CIOs, CEOs, and other business leaders think GPTBot will access during its crawl? And with proper safety measures, shouldn’t this be a non-issue?
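One such measure is scoping crawler access in robots.txt rather than blocking the bot outright. The sketch below is only an illustration under stated assumptions: the robots.txt contents, the example.com URLs and the helper name gptbot_allowed are all hypothetical, and the check uses Python's standard urllib.robotparser module. It shows a site that lets GPTBot read public pages while keeping it out of an account area. Keep in mind that robots.txt is advisory and depends on the crawler honoring it, so it complements real access controls and does not replace them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: GPTBot may crawl public pages
# but is asked to stay out of /account/. All other bots are unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /account/

User-agent: *
Disallow:
"""


def gptbot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text permits GPTBot to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", url)


if __name__ == "__main__":
    # example.com and both paths are placeholders, not real site structure.
    for path in ("/blog/post", "/account/settings"):
        url = "https://example.com" + path
        print(path, gptbot_allowed(ROBOTS_TXT, url))
```

Running the same helper against your own robots.txt text is a quick way to confirm whether GPTBot is currently blocked, partially allowed or fully allowed.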

From a legal perspective, the argument is that all crawls performed on a brand’s site should be covered by their privacy disclaimer. All websites must have a privacy disclaimer that explains how they use the data collected by their services. Advocates say this language should also mention that a third-party generative AI platform could crawl the data collected.

If not, personally identifiable information (PII) or customer data could still be “public” and brands could be exposed to a Federal Trade Commission (FTC) Section 5 claim for unfair and deceptive trade practices.

I understand this concern to some extent. If you’re the legal department of a major brand, one of your main objectives is to keep your company out of trouble. But this legal concern applies more to what is entered into ChatGPT than to what GPTBot crawls.

Everything entered into the OpenAI platform becomes part of the database and has the potential to be shared with other users, which could lead to data leaks. However, this would probably only happen if users asked questions about stored information.

This is also an unfounded concern for me because it can all be solved through responsible internet use. The same data principles we’ve used since the dawn of the internet still apply: don’t enter information you don’t want to share.

An impetus to save humanity from the advancement of AI

I can’t help but think that leaders at some of these brands blocking GPTBot are biased against the advancement of AI technology.

We often fear what we don’t understand, and some fear the idea of artificial intelligence gaining too much knowledge and becoming too powerful.

While AI is rapidly developing and starting to ‘think’ more deeply, humans are still largely in control. Furthermore, AI legislation will grow along with the technology.

When we finally reach a world of “autonomous” AI platforms, their functionality will have been shaped by years of human innovation and legislation.


3 reasons not to block ChatGPT’s GPTBot

So why should you let GPTBot crawl your site? Let’s look on the bright side with these three key benefits of embracing OpenAI bot technology.

1. 100 million people use ChatGPT every week

By not allowing GPTBot to crawl your site, you miss out on maximum brand visibility with an audience of 100 million people.

By sharing access to your website content, you can ensure that your brand is presented both factually and positively to ChatGPT users.

This means your brand is more likely to actually be recommended by ChatGPT, leading to more traffic and potential customers.

Some brands report generating 5% of their total leads, or $100,000 in monthly subscription revenue, from ChatGPT. I know that our agency has also received some leads from ChatGPT.

Another way to look at this is as a positive digital PR (DPR) play. In today’s landscape, you need to leverage DPR strategies such as branding campaigns.

Allowing GPTBot to crawl your site only magnifies this effort by allowing ChatGPT to access your brand information directly from the source and positively distribute it to 100 million users.

2. Generative Engine Optimization (GEO)

Whether or not you’re afraid of AI, we can all agree that it is changing the marketing landscape. As with all new technologies and trends in our industry, those who are slow to embrace AI as a channel for new business and brand awareness will miss the proverbial boat.

GEO is gaining ground as a sub-practice of SEO. You’re missing an important opportunity if you don’t focus some of your marketing efforts on this market. Competitors can pick it up after you let it slip through the cracks.

We know it’s easy for brands to fall behind in today’s fragmented and ever-growing marketing landscape. If your competitors spend years working on GEO, maximizing LLM visibility and developing skills and expertise in this area, they will be years ahead of you.

Now, GEO reporting capabilities haven’t yet caught up with its value, which means ROI will be difficult to measure. But that doesn’t mean you should ignore it and fall behind.

Brands and marketers need to start embracing LLMs like ChatGPT as an emerging acquisition channel that shouldn’t be ignored.

3. OpenAI’s promise to minimize harm

A healthy distrust of AI technologies is important for their legal and ethical growth. But we also need to be open-minded and realize that, as marketers, we cannot be effective if we resist and choose not to grow and innovate in the direction things are heading.

OpenAI clearly states “minimize harm” as one of the guiding principles of its platform. It also has policies to respect copyright and intellectual property and has stated that GPTBot filters out sources that violate its policies.

By allowing GPTBot to crawl your site’s content, you contribute to the clean and accurate training data that OpenAI uses to improve the accuracy of the information.

As AI technology advances, it can be easy to get caught up in skepticism, fear, and noise. Those who struggle to embrace and maximize it will be left behind.

The opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.
