Drupal: standard way of creating a slug from a string
C

10

14

A slug in this context is a string that is safe to use as an identifier, in URLs or CSS. For example, if you have this string:

I'd like to eat at McRunchies!

Its slug would be:

i-d-like-to-eat-at-mcrunchies

I want to know whether there's a standard way of building such strings in Drupal (or with PHP functions available from Drupal). More precisely, inside a Drupal theme.

Context: I'm modifying a Drupal theme so that the HTML of the nodes it generates includes their taxonomy terms as CSS classes on their containing div. The trouble is, some of those terms' names aren't valid CSS class names. I need to "slugify" them.

I've read that some people simply do this:

str_replace(" ", "-", $term->name)

This isn't really enough for me. It doesn't convert uppercase letters to lowercase and, more importantly, it doesn't replace non-ASCII characters (like à or é) with their ASCII equivalents. It also doesn't remove separator characters from the beginning and end of the string.
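To make those gaps concrete, here is a minimal sketch of what the plain str_replace() call returns. The second input string is an illustrative assumption, not something from a real taxonomy.

```php
<?php
// The sample string from above.
$name = "I'd like to eat at McRunchies!";

// Only spaces are replaced; case and punctuation survive untouched.
$naive = str_replace(' ', '-', $name);
echo $naive, "\n"; // I'd-like-to-eat-at-McRunchies!

// Hypothetical term name with accents and surrounding spaces.
$accented = str_replace(' ', '-', ' Crème brûlée ');
echo $accented, "\n"; // -Crème-brûlée-
```

Neither result is lowercased or transliterated, and the second one keeps a separator at each end.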

Is there a function in Drupal 6 (or the PHP libraries) that slugifies a string and can be used in the template.php file of a Drupal theme?

Cunnilingus answered 19/5, 2010 at 10:48 Comment(0)
M
16

You can use built-in Drupal functions to do this.

$string = drupal_clean_css_identifier($string);
$slug = drupal_html_class($string);

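These functions (available since Drupal 7) will do the trick for you. Note that drupal_html_class() lowercases the string and calls drupal_clean_css_identifier() itself, so the second call alone is enough.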
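For readers without a Drupal 7 site at hand, the following is a rough standalone approximation of what these two functions do. The approx_* names are hypothetical, the filter and character ranges are written from memory of the Drupal 7 defaults, and mb_strtolower() assumes the mbstring extension is loaded. Treat it as a sketch for predicting the output, not as the core code.

```php
<?php
// Hypothetical stand-in for drupal_clean_css_identifier() (assumed defaults).
function approx_clean_css_identifier($identifier) {
  // Spaces, underscores, slashes and "[" become dashes; "]" is dropped.
  $filter = array(' ' => '-', '_' => '-', '/' => '-', '[' => '-', ']' => '');
  $identifier = strtr($identifier, $filter);
  // Strip everything except "-", 0-9, A-Z, "_", a-z and U+00A1 upwards.
  return preg_replace(
    '/[^\x{002D}\x{0030}-\x{0039}\x{0041}-\x{005A}\x{005F}\x{0061}-\x{007A}\x{00A1}-\x{FFFF}]/u',
    '',
    $identifier
  );
}

// Hypothetical stand-in for drupal_html_class(): lowercase, then clean.
function approx_html_class($class) {
  return approx_clean_css_identifier(mb_strtolower($class, 'UTF-8'));
}

echo approx_html_class("I'd like to eat at McRunchies!"), "\n"; // id-like-to-eat-at-mcrunchies
echo approx_html_class('Crème brûlée'), "\n";                   // crème-brûlée
```

Two things are worth noticing: the apostrophe is removed rather than turned into a separator, and accented characters are kept as they are, because they are valid in CSS identifiers. No transliteration happens.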

Maestoso answered 22/10, 2014 at 3:20 Comment(1)
I added the Drupal 8/9 way as an answer below. - Byers
G
11

I am a happy Zen theme user, so I came across this wonderful function that ships with it: zen_id_safe http://api.lullabot.com/zen_id_safe

It does not depend on any other theme function, so you can just copy it to your module or theme and use it. It is a pretty small and simple function, so I will paste it here for convenience.

function zen_id_safe($string) {
  // Replace with a dash every run of characters that isn't a letter, a number, or a dash
  // (underscores are replaced too, since the pattern below does not allow them).
  return strtolower(preg_replace('/[^a-zA-Z0-9-]+/', '-', $string));
}
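If leading and trailing separators are a problem, the same idea can be extended in a few lines. This is only a sketch: the function name id_safe_trimmed() and the "id-" prefix for results that do not start with a letter are assumptions made for illustration, not part of the Zen theme.

```php
<?php
// Hypothetical variant of zen_id_safe() that also tidies up the separators.
function id_safe_trimmed($string) {
  // Same replacement as zen_id_safe(): runs of disallowed characters become one dash.
  $id = strtolower(preg_replace('/[^a-zA-Z0-9-]+/', '-', $string));
  // Collapse dashes that ended up next to each other.
  $id = preg_replace('/-{2,}/', '-', $id);
  // Drop separators from both ends.
  $id = trim($id, '-');
  // Assumed convention: prefix results that do not start with a letter.
  if ($id !== '' && !ctype_alpha($id[0])) {
    $id = 'id-' . $id;
  }
  return $id;
}

echo id_safe_trimmed("I'd like to eat at McRunchies!"), "\n"; // i-d-like-to-eat-at-mcrunchies
echo id_safe_trimmed('#1 - Option 1'), "\n";                  // id-1-option-1
```

It still does not transliterate, so accented letters are replaced by dashes rather than by their ASCII equivalents.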

Gale answered 19/5, 2010 at 13:9 Comment(2)
This is nearly what I needed. However, it doesn't transliterate and it doesn't remove separators from the beginning. Anyway, thanks for taking the time to answer. - Cunnilingus
You can add logic for removing separators. Note that it is only a requirement for ids, since classes can use everything (see barney.w3.org/TR/REC-html40/struct/global.html#adef-class and click on cdata-list). As for proper transliteration, see my comment on googletorp's answer. - Gale
C
11

I ended up using the slug function explained here (at the end of the article, you have to click in order to see the source code).

This does what I need and a couple things more, without needing to include external modules and the like.

Pasting the code below for easy future reference:

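/**
 * Calculate a slug with a maximum length for a string.
 *
 * @param $string
 *   The string you want to calculate a slug for.
 * @param $length
 *   The maximum length the slug can have (-1, the default, means no limit).
 * @param $separator
 *   The string that replaces every run of non-alphanumeric characters.
 * @return
 *   A string representing the slug
 */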
function slug($string, $length = -1, $separator = '-') {
  // transliterate
  $string = transliterate($string);
 
  // lowercase
  $string = strtolower($string);
 
  // replace every character that is not a letter or a digit (underscores included) by the separator
  $string = preg_replace('/[^a-z0-9]/i', $separator, $string);

  // replace multiple occurrences of separator by one instance
  $string = preg_replace('/(?:'. preg_quote($separator, '/') .')+/', $separator, $string);

  // cut off to maximum length
  if ($length > -1 && strlen($string) > $length) {
    $string = substr($string, 0, $length);
  }

  // remove separator from start and end of string
  $string = preg_replace('/'. preg_quote($separator, '/') .'$/', '', $string);
  $string = preg_replace('/^'. preg_quote($separator, '/') .'/', '', $string);
 
  return $string;
}
 
/**
 * Transliterate a given string.
 *
 * @param $string
 *   The string you want to transliterate.
 * @return
 *   A string representing the transliterated version of the input string.
 */
function transliterate($string) {
  static $charmap;
  if (!$charmap) {
    $charmap = array(
      // Decompositions for Latin-1 Supplement
      chr(195) . chr(128) => 'A', chr(195) . chr(129) => 'A',
      chr(195) . chr(130) => 'A', chr(195) . chr(131) => 'A',
      chr(195) . chr(132) => 'A', chr(195) . chr(133) => 'A',
      chr(195) . chr(135) => 'C', chr(195) . chr(136) => 'E',
      chr(195) . chr(137) => 'E', chr(195) . chr(138) => 'E',
      chr(195) . chr(139) => 'E', chr(195) . chr(140) => 'I',
      chr(195) . chr(141) => 'I', chr(195) . chr(142) => 'I',
      chr(195) . chr(143) => 'I', chr(195) . chr(145) => 'N',
      chr(195) . chr(146) => 'O', chr(195) . chr(147) => 'O',
      chr(195) . chr(148) => 'O', chr(195) . chr(149) => 'O',
      chr(195) . chr(150) => 'O', chr(195) . chr(153) => 'U',
      chr(195) . chr(154) => 'U', chr(195) . chr(155) => 'U',
      chr(195) . chr(156) => 'U', chr(195) . chr(157) => 'Y',
      chr(195) . chr(159) => 's', chr(195) . chr(160) => 'a',
      chr(195) . chr(161) => 'a', chr(195) . chr(162) => 'a',
      chr(195) . chr(163) => 'a', chr(195) . chr(164) => 'a',
      chr(195) . chr(165) => 'a', chr(195) . chr(167) => 'c',
      chr(195) . chr(168) => 'e', chr(195) . chr(169) => 'e',
      chr(195) . chr(170) => 'e', chr(195) . chr(171) => 'e',
      chr(195) . chr(172) => 'i', chr(195) . chr(173) => 'i',
      chr(195) . chr(174) => 'i', chr(195) . chr(175) => 'i',
      chr(195) . chr(177) => 'n', chr(195) . chr(178) => 'o',
      chr(195) . chr(179) => 'o', chr(195) . chr(180) => 'o',
      chr(195) . chr(181) => 'o', chr(195) . chr(182) => 'o',
      chr(195) . chr(184) => 'o', chr(195) . chr(185) => 'u',
      chr(195) . chr(186) => 'u', chr(195) . chr(187) => 'u',
      chr(195) . chr(188) => 'u', chr(195) . chr(189) => 'y',
      chr(195) . chr(191) => 'y',
      // Decompositions for Latin Extended-A
      chr(196) . chr(128) => 'A', chr(196) . chr(129) => 'a',
      chr(196) . chr(130) => 'A', chr(196) . chr(131) => 'a',
      chr(196) . chr(132) => 'A', chr(196) . chr(133) => 'a',
      chr(196) . chr(134) => 'C', chr(196) . chr(135) => 'c',
      chr(196) . chr(136) => 'C', chr(196) . chr(137) => 'c',
      chr(196) . chr(138) => 'C', chr(196) . chr(139) => 'c',
      chr(196) . chr(140) => 'C', chr(196) . chr(141) => 'c',
      chr(196) . chr(142) => 'D', chr(196) . chr(143) => 'd',
      chr(196) . chr(144) => 'D', chr(196) . chr(145) => 'd',
      chr(196) . chr(146) => 'E', chr(196) . chr(147) => 'e',
      chr(196) . chr(148) => 'E', chr(196) . chr(149) => 'e',
      chr(196) . chr(150) => 'E', chr(196) . chr(151) => 'e',
      chr(196) . chr(152) => 'E', chr(196) . chr(153) => 'e',
      chr(196) . chr(154) => 'E', chr(196) . chr(155) => 'e',
      chr(196) . chr(156) => 'G', chr(196) . chr(157) => 'g',
      chr(196) . chr(158) => 'G', chr(196) . chr(159) => 'g',
      chr(196) . chr(160) => 'G', chr(196) . chr(161) => 'g',
      chr(196) . chr(162) => 'G', chr(196) . chr(163) => 'g',
      chr(196) . chr(164) => 'H', chr(196) . chr(165) => 'h',
      chr(196) . chr(166) => 'H', chr(196) . chr(167) => 'h',
      chr(196) . chr(168) => 'I', chr(196) . chr(169) => 'i',
      chr(196) . chr(170) => 'I', chr(196) . chr(171) => 'i',
      chr(196) . chr(172) => 'I', chr(196) . chr(173) => 'i',
      chr(196) . chr(174) => 'I', chr(196) . chr(175) => 'i',
      chr(196) . chr(176) => 'I', chr(196) . chr(177) => 'i',
      chr(196) . chr(178) => 'IJ', chr(196) . chr(179) => 'ij',
      chr(196) . chr(180) => 'J', chr(196) . chr(181) => 'j',
      chr(196) . chr(182) => 'K', chr(196) . chr(183) => 'k',
      chr(196) . chr(184) => 'k', chr(196) . chr(185) => 'L',
      chr(196) . chr(186) => 'l', chr(196) . chr(187) => 'L',
      chr(196) . chr(188) => 'l', chr(196) . chr(189) => 'L',
      chr(196) . chr(190) => 'l', chr(196) . chr(191) => 'L',
      chr(197) . chr(128) => 'l', chr(197) . chr(129) => 'L',
      chr(197) . chr(130) => 'l', chr(197) . chr(131) => 'N',
      chr(197) . chr(132) => 'n', chr(197) . chr(133) => 'N',
      chr(197) . chr(134) => 'n', chr(197) . chr(135) => 'N',
      chr(197) . chr(136) => 'n', chr(197) . chr(137) => 'N',
      chr(197) . chr(138) => 'n', chr(197) . chr(139) => 'N',
      chr(197) . chr(140) => 'O', chr(197) . chr(141) => 'o',
      chr(197) . chr(142) => 'O', chr(197) . chr(143) => 'o',
      chr(197) . chr(144) => 'O', chr(197) . chr(145) => 'o',
      chr(197) . chr(146) => 'OE', chr(197) . chr(147) => 'oe',
      chr(197) . chr(148) => 'R', chr(197) . chr(149) => 'r',
      chr(197) . chr(150) => 'R', chr(197) . chr(151) => 'r',
      chr(197) . chr(152) => 'R', chr(197) . chr(153) => 'r',
      chr(197) . chr(154) => 'S', chr(197) . chr(155) => 's',
      chr(197) . chr(156) => 'S', chr(197) . chr(157) => 's',
      chr(197) . chr(158) => 'S', chr(197) . chr(159) => 's',
      chr(197) . chr(160) => 'S', chr(197) . chr(161) => 's',
      chr(197) . chr(162) => 'T', chr(197) . chr(163) => 't',
      chr(197) . chr(164) => 'T', chr(197) . chr(165) => 't',
      chr(197) . chr(166) => 'T', chr(197) . chr(167) => 't',
      chr(197) . chr(168) => 'U', chr(197) . chr(169) => 'u',
      chr(197) . chr(170) => 'U', chr(197) . chr(171) => 'u',
      chr(197) . chr(172) => 'U', chr(197) . chr(173) => 'u',
      chr(197) . chr(174) => 'U', chr(197) . chr(175) => 'u',
      chr(197) . chr(176) => 'U', chr(197) . chr(177) => 'u',
      chr(197) . chr(178) => 'U', chr(197) . chr(179) => 'u',
      chr(197) . chr(180) => 'W', chr(197) . chr(181) => 'w',
      chr(197) . chr(182) => 'Y', chr(197) . chr(183) => 'y',
      chr(197) . chr(184) => 'Y', chr(197) . chr(185) => 'Z',
      chr(197) . chr(186) => 'z', chr(197) . chr(187) => 'Z',
      chr(197) . chr(188) => 'z', chr(197) . chr(189) => 'Z',
      chr(197) . chr(190) => 'z', chr(197) . chr(191) => 's',
      // Euro Sign
      chr(226) . chr(130) . chr(172) => 'E'
    );
  }
 
  // transliterate
  return strtr($string, $charmap);
}
 
function is_slug($str) {
  // Strict comparison, so numeric-looking strings are not compared as numbers.
  return $str === slug($str);
}
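As a hedged alternative to maintaining the character map by hand, PHP's intl extension can do the transliteration step. The sketch below is not the function from the linked article: slug_sketch() is a hypothetical name, it uses intl when that extension is loaded, and otherwise falls back to a deliberately tiny illustrative map.

```php
<?php
// Hypothetical compact variant of slug() that prefers the intl extension.
function slug_sketch($string, $separator = '-') {
  if (function_exists('transliterator_transliterate')) {
    // Convert to Latin script, then reduce accented letters to plain ASCII.
    $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $string);
    if ($ascii !== false) {
      $string = $ascii;
    }
  }
  else {
    // Fallback: a very small map, for illustration only.
    $string = strtr($string, array('à' => 'a', 'è' => 'e', 'é' => 'e', 'ü' => 'u', 'ß' => 'ss'));
  }

  $string = strtolower($string);

  // Replace each run of non-alphanumeric characters with one separator.
  $string = preg_replace('/[^a-z0-9]+/', $separator, $string);

  // Remove separators from both ends.
  $quoted = preg_quote($separator, '/');
  return preg_replace('/^(?:' . $quoted . ')+|(?:' . $quoted . ')+$/', '', $string);
}

echo slug_sketch("I'd like to eat at McRunchies!"), "\n"; // i-d-like-to-eat-at-mcrunchies
echo slug_sketch('Café Crème!'), "\n";                    // cafe-creme
```

The trade-off is a dependency on a PHP extension that may not be enabled on every host, which is why the hand-written map above remains a reasonable choice.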
Cunnilingus answered 20/5, 2010 at 7:27 Comment(0)
P
6

There's also this function from Drupal 7, which you can copy into your project:

http://api.drupal.org/api/function/drupal_clean_css_identifier/7

Peppermint answered 19/5, 2010 at 14:41 Comment(1)
It is nice to know that Drupal has a function like this. However, it doesn't do everything I needed (see my other answers). But +1 for the research effort. - Cunnilingus
B
4

For Drupal 8/9 you can use Html::getClass():

$slugify = Html::getClass('A @ Stríng-that n+eeds cónvert');

Don't forget to import the namespace when you use it inside a module:

use Drupal\Component\Utility\Html;
Byers answered 7/2, 2021 at 15:56 Comment(1)
This should really be the correct answer these days, now that D7 is EOL. - Supererogatory
T
2

This might help. I find I am doing this kind of slugging all the time now, rather than using ID numbers as unique keys in my tables.

/**
 * Class SlugMaker
 *
 * Methods to create text slugs for URLs.
 */
class SlugMaker {

    /**
     * Cleans up a string, such as a page title,
     * so it becomes a readable, valid URL segment.
     *
     * @param string $str The input string.
     * @return string A URL-friendly slug.
     */
    function slugifyAlnum( $str ){
        $str = preg_replace('#[^0-9a-z ]#i', '', $str );    // allow letters, numbers + spaces only
        $str = preg_replace('#( ){2,}#', ' ', $str );       // rm adjacent spaces
        $str = trim( $str );

        return strtolower( str_replace( ' ', '-', $str ) ); // slugify
    }

    /**
     * Same as slugifyAlnum(), with the current month and year appended
     * (for example "-may-2010").
     *
     * @param string $str The input string.
     * @return string A URL-friendly slug ending in the month and year.
     */
    function slugifyAlnumAppendMonth( $str ){
        $val = $this->slugifyAlnum( $str );

        return $val . '-' . strtolower( date( "M" ) ) . '-' . date( "Y" );
    }

}

Using this together with .htaccess rewrite rules means you can go directly from a URL like:

/articles/my-pops-nuts-may-2010

Straight through to the table look up without having to unmap IDs (applying a suitable filter naturally).

Append or prepend some kind of date optionally in order to enforce a degree of uniqueness as you wish.
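A date suffix only gives a degree of uniqueness. If strict uniqueness is needed, one common alternative is a numeric suffix. The sketch below assumes the slugs already in use have been loaded into an array; the function name unique_slug() and that array are hypothetical, and in practice the check would more likely be a table lookup.

```php
<?php
// Hypothetical helper: append -2, -3, ... until the slug is not taken.
function unique_slug($slug, array $taken) {
  if (!in_array($slug, $taken, true)) {
    return $slug;
  }
  $n = 2;
  while (in_array($slug . '-' . $n, $taken, true)) {
    $n++;
  }
  return $slug . '-' . $n;
}

$existing = array('my-post', 'my-post-2');
echo unique_slug('my-post', $existing), "\n";    // my-post-3
echo unique_slug('other-post', $existing), "\n"; // other-post
```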

HTH

Thermolysis answered 19/5, 2010 at 13:17 Comment(2)
Thanks for posting this. The only thing I don't like about this function is that it leaves the separators at the beginning and end of identifiers; if you have something like #1 - Option 1, it will get transformed into -1-option-1, which is not safe for use in CSS. A minor thing is that it doesn't transliterate. - Cunnilingus
Upvote for the example URL /articles/my-pops-nuts-may-2010. - Headwaiter
I
1

I would recommend the Transliteration module, which Pathauto can use. With it you can call the transliteration_get() function. It also handles Unicode transformation.

Impeller answered 19/5, 2010 at 11:52 Comment(7)
Pathauto does not use the Transliteration module. It uses its own function, pathauto_cleanstring(), which depends on loads of Pathauto settings. drupalcontrib.org/api/function/pathauto_cleanstring/6 - Gale
@barraponto You can make Pathauto use it to handle Unicode in URLs, which it doesn't handle very well otherwise. - Impeller
How do I get Pathauto to use the Transliteration module? I've been looking for that... #2866242 - Gale
Thanks for posting this. However, I'll avoid adding additional modules if possible. I was looking for something provided directly by Drupal or PHP. - Cunnilingus
I bet you are already using Pathauto. It has a built-in transliteration file (i18n-ascii.txt) which provides transliteration in pathauto_cleanstring(). You do NOT need the Transliteration module. - Gale
@barraponto, Transliteration is my recommendation; I haven't said that it's required or the only way. - Impeller
@googletorp, I noticed. I just wanted to make it clear that Transliteration is not needed, since egarcia discarded your answer because of extra modules. pathauto_cleanstring() is enough; the Transliteration module is great for file paths (which Pathauto does not handle). I highly recommend it. - Gale
L
1

This is what worked for me after a lot of trial and error, including converting both French and German titles with special characters to a slug.

I created a custom Twig filter so I can use it like this:

{{ node.field_title.value|slug }}

It will convert:

Wärmeabgabe & Abmessungen
Typenübersicht
Montage- und Anschlussmaße

Into:

warmeabgabe--abmessungen
typenubersicht
montage--und-anschlussmasse

for example.

HOWTO: In a custom module, create a services.yml file: modules/custom/mymodule/mymodule.services.yml

services:
  mymodule.twig_extensions:
    class: Drupal\mymodule\HelperTwigExtensions
    tags:
      - { name: twig.extension }

Create the modules/custom/mymodule/src/HelperTwigExtensions.php file:

<?php

namespace Drupal\mymodule;

use Drupal\Component\Utility\Html;

/**
 * Twig extension that adds a custom "slug" filter (extends Twig's Twig_Extension base class).
 */
class HelperTwigExtensions extends \Twig_Extension {

  /**
   * {@inheritdoc}
   */
  public function getName() {
    return 'mymodule.twig_extensions';
  }

  /**
   * {@inheritdoc}
   */
  public function getFilters() {
    return [
      new \Twig_SimpleFilter('slug', [$this, 'createSlug']),
    ];
  }

  /**
   * Create a slug from a string input.
   */
  public function createSlug($input) {
    // Convert most of the special characters.
    $slug = Html::getClass($input);
    $slug = strtolower($slug);
    // Convert accented text characters.
    $unwanted_array = [
      'Þ' => 'b',
      'ß' => 'ss',
      'à' => 'a',
      'á' => 'a',
      'â' => 'a',
      'ã' => 'a',
      'ä' => 'a',
      'å' => 'a',
      'æ' => 'a',
      'ç' => 'c',
      'è' => 'e',
      'é' => 'e',
      'ê' => 'e',
      'ë' => 'e',
      'ì' => 'i',
      'í' => 'i',
      'î' => 'i',
      'ï' => 'i',
      'ð' => 'o',
      'ñ' => 'n',
      'ò' => 'o',
      'ó' => 'o',
      'ô' => 'o',
      'õ' => 'o',
      'ö' => 'o',
      'ø' => 'o',
      'ù' => 'u',
      'ú' => 'u',
      'û' => 'u',
      'ü' => 'u',
      'ý' => 'y',
      'þ' => 'b',
      'ÿ' => 'y',
    ];
    $slug = strtr($slug, $unwanted_array);
    return $slug;
  }

}
Lamellibranch answered 6/4, 2021 at 8:47 Comment(0)
G
0

You can use preg_replace and strtolower:

preg_replace('/[^a-z]/','-', strtolower($term->name)); 
Glabella answered 19/5, 2010 at 11:36 Comment(2)
This is clean and simple. Unfortunately it doesn't do everything I need. But thanks for answering. - Cunnilingus
I just found that the Basic theme implements what you're looking for this way: $string = strtolower(preg_replace('/[^a-zA-Z0-9_-]+/', '-', $string)); - Glabella
C
0

On Drupal 8/9, you can use the Pathauto alias cleaner service (this requires the Pathauto module to be installed):

/** @var \Drupal\pathauto\AliasCleaner $cleaner */
$cleaner = \Drupal::service('pathauto.alias_cleaner');
$slug = $cleaner->cleanString($node->getTitle());
Crutch answered 27/2, 2023 at 10:54 Comment(0)
